[jira] [Created] (IGNITE-12089) JVM is halted after this error during rolling restart of a cluster

2019-08-20 Thread temp2 (Jira)
temp2 created IGNITE-12089:
--

 Summary: JVM is halted after this error during rolling restart of 
a cluster
 Key: IGNITE-12089
 URL: https://issues.apache.org/jira/browse/IGNITE-12089
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.6
Reporter: temp2


JVM is halted after this error during rolling restart of a cluster:

The exception is:

528-a852-c65782e337f0][2019-08-20 17:22:10,901][ERROR][ttl-cleanup-worker-#155][] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=class 
o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Runtime 
failure on bounds: [lower=PendingRow [], upper=PendingRow []]]
org.apache.ignite.IgniteException: Runtime failure on bounds: [lower=PendingRow [], upper=PendingRow []]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:971) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:950) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.expire(IgniteCacheOffheapManagerImpl.java:1022) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:197) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:137) [ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.6.0.jar:2.6.0]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Caused by: java.lang.IllegalStateException: Failed to get page IO instance (page content is corrupted)
	at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:83) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:95) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:148) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.tree.PendingRow.initKey(PendingRow.java:72) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:118) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:31) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(BPlusTree.java:4660) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.init(BPlusTree.java:4562) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.access$5300(BPlusTree.java:4501) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetCursor.notFound(BPlusTree.java:2633) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Search.run0(BPlusTree.java:293) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:4816) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$GetPageHandler.run(BPlusTree.java:4801) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:158) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.DataStructure.read(DataStructure.java:332) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1140) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1149) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1107) ~[ignite-core-2.6.0.jar:2.6.0]
	at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$15800(BPlusTree.java:83) ~[ignite
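
For context, a minimal sketch of configuring a different failure handler so that SYSTEM_WORKER_TERMINATION stops the node instead of halting the JVM. This only mitigates the halt during a rolling restart; it does not address the page corruption itself:

{code:java}
// Sketch: StopNodeFailureHandler stops the failed node gracefully, whereas the
// default StopNodeOrHaltFailureHandler (seen in the log above) may halt the JVM.
IgniteConfiguration cfg = new IgniteConfiguration();

cfg.setFailureHandler(new org.apache.ignite.failure.StopNodeFailureHandler());
{code}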

[jira] [Assigned] (IGNITE-12087) Transactional putAll - significant performance drop on big batches of entries.

2019-08-20 Thread Eduard Shangareev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Shangareev reassigned IGNITE-12087:
--

Assignee: Eduard Shangareev

> Transactional putAll - significant performance drop on big batches of entries.
> --
>
> Key: IGNITE-12087
> URL: https://issues.apache.org/jira/browse/IGNITE-12087
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Pavel Pereslegin
>Assignee: Eduard Shangareev
>Priority: Major
>
> After IGNITE-5227 was fixed, I found a significant performance drop in the 
> putAll operation.
> Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
> After IGNITE-5227 it takes 130 seconds (~100x slower).
> I checked different batch sizes:
> 10_000 - 10 seconds
> 20_000 - 48 seconds
> 30_000 - 130 seconds
> and I was not able to wait for the result with 100_000 entries.
> Reproducer
> {code:java}
> public class CheckPutAll extends GridCommonAbstractTest {
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);
>         CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);
>         ccfg.setAtomicityMode(TRANSACTIONAL);
>         cfg.setCacheConfiguration(ccfg);
>         return cfg;
>     }
>     @Test
>     public void check() throws Exception {
>         int cnt = 30_000;
>         Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));
>         for (int i = 0; i < cnt; i++)
>             data.put(i, i);
>         Ignite node0 = startGrid(0);
>         IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);
>         cache0.putAll(data);
>     }
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-7285) Add default query timeout

2019-08-20 Thread Ivan Pavlukhin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911729#comment-16911729
 ] 

Ivan Pavlukhin commented on IGNITE-7285:


[~amashenkov], could you please step in and continue the review? Unfortunately, 
for a couple of weeks I will have limited access to my computer and cannot do the 
review in a timely manner.

> Add default query timeout
> -
>
> Key: IGNITE-7285
> URL: https://issues.apache.org/jira/browse/IGNITE-7285
> Project: Ignite
>  Issue Type: Improvement
>  Components: cache, sql
>Affects Versions: 2.3
>Reporter: Valentin Kulichenko
>Assignee: Saikat Maitra
>Priority: Major
>  Labels: sql-stability
> Fix For: 2.8
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently it's possible to provide a timeout only at the query level. It would be 
> very useful to have a default timeout value provided at cache startup. Let's 
> add a {{CacheConfiguration#defaultQueryTimeout}} configuration property.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-7285) Add default query timeout

2019-08-20 Thread Ivan Pavlukhin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911725#comment-16911725
 ] 

Ivan Pavlukhin commented on IGNITE-7285:


Hi [~samaitra],

I suspect that not all places where a timeout should be passed are covered yet. 
To me it looks more natural to calculate the query timeout (taking the default 
one into account) upon {{QueryParameters}} initialization. The following points 
are worth considering for an implementation (see the sketch below):
1. Reduce the number of places where the _default timeout_ is read from the 
configuration (ideally there should be only one place).
2. Make it possible to disable the timeout for a particular query 
({{SqlFieldsQuery.setTimeout(0)}}) when a non-zero default timeout is configured.
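
For illustration, a minimal sketch of such a single resolution point. The method and parameter names are hypothetical, not the actual {{QueryParameters}} API:

{code:java}
// Hypothetical helper: the one place that computes the effective timeout.
// qryTimeout == null -> the query did not set a timeout, fall back to the default.
// qryTimeout == 0    -> the user explicitly disabled the timeout.
static long effectiveQueryTimeout(Integer qryTimeout, long dfltQryTimeout) {
    if (qryTimeout != null)
        return qryTimeout;

    return dfltQryTimeout; // 0 here means "no default timeout configured".
}
{code}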

> Add default query timeout
> -
>
> Key: IGNITE-7285
> URL: https://issues.apache.org/jira/browse/IGNITE-7285
> Project: Ignite
>  Issue Type: Improvement
>  Components: cache, sql
>Affects Versions: 2.3
>Reporter: Valentin Kulichenko
>Assignee: Saikat Maitra
>Priority: Major
>  Labels: sql-stability
> Fix For: 2.8
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently it's possible to provide a timeout only at the query level. It would be 
> very useful to have a default timeout value provided at cache startup. Let's 
> add a {{CacheConfiguration#defaultQueryTimeout}} configuration property.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911630#comment-16911630
 ] 

Maxim Muzafarov commented on IGNITE-12080:
--

Folks,

I've looked through the changes and have a few questions regarding the 
implementation.
 * Why is a static utils class used for collecting rebalance info? Why not, 
for instance, the DiagnosticProcessor (introduced recently)?
 * After IGNITE-3195 is merged there will be no reason to collect statistics 
about the `rebalance topic`, since it will be replaced with thread pools.
 * Do you have any benchmarks with the `printRebalanceStatistics` property 
enabled? Since the rebalance procedure can run for 4-8 hours, it is necessary to 
check and analyze JVM metrics (GC, used heap etc.). We can have thousands of 
Supply-Demand messages, and for each of them we hold a 
`RebalanceMessageStatistics` in the heap until the rebalance procedure finishes 
(see the sketch below).
 * The printed statistics are not in a human-readable format. Is that 
user-friendly? Moreover, it is up to the implementation to print statistics the 
right way in the logs. I think we don't need any abbreviations (e.g. 
`writeAliasesRebalanceStatistics`) to decode the logs.
 * Do we have a TC execution with the `printRebalanceStatistics` property 
enabled on all suites? It seems to me we can get a `NullPointerException` in 
some cases.
 * Why is `RebalanceMessageStatistics` needed? I don't think that holding 
`sndMsgTime` for each message will be useful for rebalance statistics at all. 
The same goes for `rcvMsgTime`.
 * I think `ReceivePartitionStatistics`.`msgSize` will be the same in 98% of 
cases. Do we need it?
 * Do we need `PartitionStatistics` at all? Can the same value be obtained from 
the `onRebalanceKeyReceived` metric at the end of the rebalance procedure?

Please do not merge the PR until all of these issues are resolved.
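
For illustration, a sketch of the heap concern: one running counter per supplier needs constant memory, unlike one statistics object per Supply message held until the end of rebalance. The names below are hypothetical:

{code:java}
// Hypothetical alternative: O(1) memory per supplier instead of one
// RebalanceMessageStatistics object per received Supply message.
static final class SupplierStats {
    long msgs;    // number of supply messages received
    long entries; // total entries received
    long bytes;   // total bytes received

    void onSupplyMessage(long msgEntries, long msgBytes) {
        msgs++;
        entries += msgEntries;
        bytes += msgBytes;
    }
}
{code}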

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition distribution (list of nodeIds with 
> primary/backup flag and the marked supplier nodeId)
> h3. Information per supplier node:
> # How many partitions were requested: 
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and a primary 
> or backup partition flag



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-9638) .NET: JVM keeps track of CLR Threads, even when they are finished

2019-08-20 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911566#comment-16911566
 ] 

Pavel Tupitsyn commented on IGNITE-9638:


[~ilyak] we could do that, as well as provide a way to detach manually. Both of 
those approaches are bad for usability, but maybe they make sense in the short 
term while we work on the proper fix.

> .NET: JVM keeps track of CLR Threads, even when they are finished 
> --
>
> Key: IGNITE-9638
> URL: https://issues.apache.org/jira/browse/IGNITE-9638
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms
>Affects Versions: 2.6
>Reporter: Ilya Kasnacheev
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.8
>
> Attachments: IgniteRepro.zip
>
>
> When you create a Thread in C#, the JVM creates a corresponding thread 
> "Thread-" which is visible in jstack. When C# joins this thread, it is 
> not removed from the JVM and is kept around. This means that jstack may show 
> thousands of threads which no longer exist. A reproducer is attached. It is 
> presumed that memory will be exhausted eventually.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IGNITE-12088) Cache or template name should be validated before attempt to start

2019-08-20 Thread Pavel Kovalenko (Jira)
Pavel Kovalenko created IGNITE-12088:


 Summary: Cache or template name should be validated before attempt 
to start
 Key: IGNITE-12088
 URL: https://issues.apache.org/jira/browse/IGNITE-12088
 Project: Ignite
  Issue Type: Bug
  Components: cache
Reporter: Pavel Kovalenko
 Fix For: 2.8


If a cache name is too long, it can make it impossible to create the work 
directory for that cache:

{noformat}
[2019-08-20 
19:35:42,139][ERROR][exchange-worker-#172%node1%][IgniteTestResources] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=NoOpFailureHandler [super=AbstractFailureHandler 
[ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to 
initialize cache working directory (failed to create, make sure the work folder 
has correct permissions): 
/home/gridgain/projects/incubator-ignite/work/db/node1/cache-CacheConfiguration 
[name=ccfg3staticTemplate*, grpName=null, memPlcName=null, 
storeConcurrentLoadAllThreshold=5, rebalancePoolSize=1, rebalanceTimeout=1, 
evictPlc=null, evictPlcFactory=null, onheapCache=false, sqlOnheapCache=false, 
sqlOnheapCacheMaxSize=0, evictFilter=null, eagerTtl=true, dfltLockTimeout=0, 
nearCfg=null, writeSync=null, storeFactory=null, storeKeepBinary=false, 
loadPrevVal=false, aff=null, cacheMode=PARTITIONED, atomicityMode=null, 
backups=6, invalidate=false, tmLookupClsName=null, rebalanceMode=ASYNC, 
rebalanceOrder=0, rebalanceBatchSize=524288, rebalanceBatchesPrefetchCnt=2, 
maxConcurrentAsyncOps=500, sqlIdxMaxInlineSize=-1, writeBehindEnabled=false, 
writeBehindFlushSize=10240, writeBehindFlushFreq=5000, 
writeBehindFlushThreadCnt=1, writeBehindBatchSize=512, 
writeBehindCoalescing=true, maxQryIterCnt=1024, affMapper=null, 
rebalanceDelay=0, rebalanceThrottle=0, interceptor=null, 
longQryWarnTimeout=3000, qryDetailMetricsSz=0, readFromBackup=true, 
nodeFilter=null, sqlSchema=null, sqlEscapeAll=false, cpOnRead=true, 
topValidator=null, partLossPlc=IGNORE, qryParallelism=1, evtsDisabled=false, 
encryptionEnabled=false, diskPageCompression=null, 
diskPageCompressionLevel=null]0]]
class org.apache.ignite.IgniteCheckedException: Failed to initialize cache 
working directory (failed to create, make sure the work folder has correct 
permissions): 
/home/gridgain/projects/incubator-ignite/work/db/node1/cache-CacheConfiguration 
[name=ccfg3staticTemplate*, grpName=null, memPlcName=null, 
storeConcurrentLoadAllThreshold=5, rebalancePoolSize=1, rebalanceTimeout=1, 
evictPlc=null, evictPlcFactory=null, onheapCache=false, sqlOnheapCache=false, 
sqlOnheapCacheMaxSize=0, evictFilter=null, eagerTtl=true, dfltLockTimeout=0, 
nearCfg=null, writeSync=null, storeFactory=null, storeKeepBinary=false, 
loadPrevVal=false, aff=null, cacheMode=PARTITIONED, atomicityMode=null, 
backups=6, invalidate=false, tmLookupClsName=null, rebalanceMode=ASYNC, 
rebalanceOrder=0, rebalanceBatchSize=524288, rebalanceBatchesPrefetchCnt=2, 
maxConcurrentAsyncOps=500, sqlIdxMaxInlineSize=-1, writeBehindEnabled=false, 
writeBehindFlushSize=10240, writeBehindFlushFreq=5000, 
writeBehindFlushThreadCnt=1, writeBehindBatchSize=512, 
writeBehindCoalescing=true, maxQryIterCnt=1024, affMapper=null, 
rebalanceDelay=0, rebalanceThrottle=0, interceptor=null, 
longQryWarnTimeout=3000, qryDetailMetricsSz=0, readFromBackup=true, 
nodeFilter=null, sqlSchema=null, sqlEscapeAll=false, cpOnRead=true, 
topValidator=null, partLossPlc=IGNORE, qryParallelism=1, evtsDisabled=false, 
encryptionEnabled=false, diskPageCompression=null, 
diskPageCompressionLevel=null]0
at 
org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.checkAndInitCacheWorkDir(FilePageStoreManager.java:769)
at 
org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.checkAndInitCacheWorkDir(FilePageStoreManager.java:748)
at 
org.apache.ignite.internal.processors.cache.CachesRegistry.persistCacheConfigurations(CachesRegistry.java:289)
at 
org.apache.ignite.internal.processors.cache.CachesRegistry.registerAllCachesAndGroups(CachesRegistry.java:264)
at 
org.apache.ignite.internal.processors.cache.CachesRegistry.update(CachesRegistry.java:202)
at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedManager.java:850)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:1306)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:846)
at 
org.apache.ignite.internal.processors.c
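{noformat}

A sketch of the kind of upfront validation this ticket asks for. The length limit and method below are hypothetical and only illustrate the idea:

{code:java}
// Hypothetical pre-start check: fail fast with a clear message instead of a
// critical error from FilePageStoreManager when the work directory is created.
static void validateCacheName(String cacheName) {
    if (cacheName == null || cacheName.isEmpty())
        throw new IllegalArgumentException("Cache name must not be null or empty.");

    // 255 characters is a typical filesystem file-name limit; this value is an
    // assumption, not a constant taken from Ignite.
    int maxLen = 255 - "cache-".length();

    if (cacheName.length() > maxLen)
        throw new IllegalArgumentException("Cache name is too long to create a " +
            "work directory (max " + maxLen + " characters): " + cacheName);
}
{code}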

[jira] [Updated] (IGNITE-12088) Cache or template name should be validated before attempt to start

2019-08-20 Thread Pavel Kovalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Kovalenko updated IGNITE-12088:
-
Labels: usability  (was: )

> Cache or template name should be validated before attempt to start
> --
>
> Key: IGNITE-12088
> URL: https://issues.apache.org/jira/browse/IGNITE-12088
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Pavel Kovalenko
>Priority: Critical
>  Labels: usability
> Fix For: 2.8
>
>
> If a cache name is too long, it can make it impossible to create the work 
> directory for that cache:
> {noformat}
> [2019-08-20 
> 19:35:42,139][ERROR][exchange-worker-#172%node1%][IgniteTestResources] 
> Critical system error detected. Will be handled accordingly to configured 
> handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler 
> [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
> SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
> [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Failed to 
> initialize cache working directory (failed to create, make sure the work 
> folder has correct permissions): 
> /home/gridgain/projects/incubator-ignite/work/db/node1/cache-CacheConfiguration
>  [name=ccfg3staticTemplate*, grpName=null, memPlcName=null, 
> storeConcurrentLoadAllThreshold=5, rebalancePoolSize=1, 
> rebalanceTimeout=1, evictPlc=null, evictPlcFactory=null, 
> onheapCache=false, sqlOnheapCache=false, sqlOnheapCacheMaxSize=0, 
> evictFilter=null, eagerTtl=true, dfltLockTimeout=0, nearCfg=null, 
> writeSync=null, storeFactory=null, storeKeepBinary=false, loadPrevVal=false, 
> aff=null, cacheMode=PARTITIONED, atomicityMode=null, backups=6, 
> invalidate=false, tmLookupClsName=null, rebalanceMode=ASYNC, 
> rebalanceOrder=0, rebalanceBatchSize=524288, rebalanceBatchesPrefetchCnt=2, 
> maxConcurrentAsyncOps=500, sqlIdxMaxInlineSize=-1, writeBehindEnabled=false, 
> writeBehindFlushSize=10240, writeBehindFlushFreq=5000, 
> writeBehindFlushThreadCnt=1, writeBehindBatchSize=512, 
> writeBehindCoalescing=true, maxQryIterCnt=1024, affMapper=null, 
> rebalanceDelay=0, rebalanceThrottle=0, interceptor=null, 
> longQryWarnTimeout=3000, qryDetailMetricsSz=0, readFromBackup=true, 
> nodeFilter=null, sqlSchema=null, sqlEscapeAll=false, cpOnRead=true, 
> topValidator=null, partLossPlc=IGNORE, qryParallelism=1, evtsDisabled=false, 
> encryptionEnabled=false, diskPageCompression=null, 
> diskPageCompressionLevel=null]0]]
> class org.apache.ignite.IgniteCheckedException: Failed to initialize cache 
> working directory (failed to create, make sure the work folder has correct 
> permissions): 
> /home/gridgain/projects/incubator-ignite/work/db/node1/cache-CacheConfiguration
>  [name=ccfg3staticTemplate*, grpName=null, memPlcName=null, 
> storeConcurrentLoadAllThreshold=5, rebalancePoolSize=1, 
> rebalanceTimeout=1, evictPlc=null, evictPlcFactory=null, 
> onheapCache=false, sqlOnheapCache=false, sqlOnheapCacheMaxSize=0, 
> evictFilter=null, eagerTtl=true, dfltLockTimeout=0, nearCfg=null, 
> writeSync=null, storeFactory=null, storeKeepBinary=false, loadPrevVal=false, 
> aff=null, cacheMode=PARTITIONED, atomicityMode=null, backups=6, 
> invalidate=false, tmLookupClsName=null, rebalanceMode=ASYNC, 
> rebalanceOrder=0, rebalanceBatchSize=524288, rebalanceBatchesPrefetchCnt=2, 
> maxConcurrentAsyncOps=500, sqlIdxMaxInlineSize=-1, writeBehindEnabled=false, 
> writeBehindFlushSize=10240, writeBehindFlushFreq=5000, 
> writeBehindFlushThreadCnt=1, writeBehindBatchSize=512, 
> writeBehindCoalescing=true, maxQryIterCnt=1024, affMapper=null, 
> rebalanceDelay=0, rebalanceThrottle=0, interceptor=null, 
> longQryWarnTimeout=3000, qryDetailMetricsSz=0, readFromBackup=true, 
> nodeFilter=null, sqlSchema=null, sqlEscapeAll=false, cpOnRead=true, 
> topValidator=null, partLossPlc=IGNORE, qryParallelism=1, evtsDisabled=false, 
> encryptionEnabled=false, diskPageCompression=null, 
> diskPageCompressionLevel=null]0
>   at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.checkAndInitCacheWorkDir(FilePageStoreManager.java:769)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.checkAndInitCacheWorkDir(FilePageStoreManager.java:748)
>   at 
> org.apache.ignite.internal.processors.cache.CachesRegistry.persistCacheConfigurations(CachesRegistry.java:289)
>   at 
> org.apache.ignite.internal.processors.cache.CachesRegistry.registerAllCachesAndGroups(CachesRegistry.java:264)
>   at 
> org.apache.ignite.internal.processors.cache.CachesRegistry.update(CachesRegistry.java:202)
>   at 
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedMa

[jira] [Commented] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Eduard Shangareev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911539#comment-16911539
 ] 

Eduard Shangareev commented on IGNITE-12080:


Looks good. Thank you for your contribution.

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition distribution (list of nodeIds with 
> primary/backup flag and the marked supplier nodeId)
> h3. Information per supplier node:
> # How many partitions were requested: 
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and a primary 
> or backup partition flag



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-9638) .NET: JVM keeps track of CLR Threads, even when they are finished

2019-08-20 Thread Ilya Kasnacheev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911534#comment-16911534
 ] 

Ilya Kasnacheev commented on IGNITE-9638:
-

Maybe we should have an option where every thread is detached every time we 
need to return?

> .NET: JVM keeps track of CLR Threads, even when they are finished 
> --
>
> Key: IGNITE-9638
> URL: https://issues.apache.org/jira/browse/IGNITE-9638
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms
>Affects Versions: 2.6
>Reporter: Ilya Kasnacheev
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.8
>
> Attachments: IgniteRepro.zip
>
>
> When you create a Thread in C#, the JVM creates a corresponding thread 
> "Thread-" which is visible in jstack. When C# joins this thread, it is 
> not removed from the JVM and is kept around. This means that jstack may show 
> thousands of threads which no longer exist. A reproducer is attached. It is 
> presumed that memory will be exhausted eventually.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (IGNITE-5227) StackOverflowError in GridCacheMapEntry#checkOwnerChanged()

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-5227:
---
Fix Version/s: 2.8

> StackOverflowError in GridCacheMapEntry#checkOwnerChanged()
> ---
>
> Key: IGNITE-5227
> URL: https://issues.apache.org/jira/browse/IGNITE-5227
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 1.6
>Reporter: Alexey Goncharuk
>Assignee: Stepachev Maksim
>Priority: Critical
> Fix For: 2.8
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A simple test reproducing this error:
> {code}
> /**
>  * @throws Exception if failed.
>  */
> public void testBatchUnlock() throws Exception {
>     startGrid(0);
> 
>     grid(0).createCache(new CacheConfiguration<String, Integer>(DEFAULT_CACHE_NAME)
>         .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL));
> 
>     try {
>         final CountDownLatch releaseLatch = new CountDownLatch(1);
> 
>         IgniteInternalFuture fut = GridTestUtils.runAsync(new Callable<Object>() {
>             @Override public Object call() throws Exception {
>                 IgniteCache cache = grid(0).cache(null);
> 
>                 Lock lock = cache.lock("key");
> 
>                 try {
>                     lock.lock();
> 
>                     releaseLatch.await();
>                 }
>                 finally {
>                     lock.unlock();
>                 }
> 
>                 return null;
>             }
>         });
> 
>         Map putMap = new LinkedHashMap<>();
> 
>         putMap.put("key", "trigger");
> 
>         for (int i = 0; i < 10_000; i++)
>             putMap.put("key-" + i, "value");
> 
>         IgniteCache asyncCache = grid(0).cache(null).withAsync();
> 
>         asyncCache.putAll(putMap);
> 
>         IgniteFuture resFut = asyncCache.future();
> 
>         Thread.sleep(1000);
> 
>         releaseLatch.countDown();
> 
>         fut.get();
> 
>         resFut.get();
>     }
>     finally {
>         stopAllGrids();
>     }
> }
> {code}
> We should replace a recursive call with a simple iteration over the linked 
> list.
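
For illustration, a minimal sketch of that change. The node structure below is made up; it only shows the recursion-to-loop rewrite, not GridCacheMapEntry's actual code:

{code:java}
// Illustrative only: walking a long chain recursively costs one stack frame per
// element (e.g. 10_000 queued lock owners -> StackOverflowError), while a loop
// runs in constant stack depth.
static final class Node {
    Node next;

    void onOwnerChanged() { /* notify listeners, etc. */ }
}

static void checkOwnerChanged(Node head) {
    for (Node cur = head; cur != null; cur = cur.next)
        cur.onOwnerChanged(); // previously: cur.next was processed via recursion
}
{code}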



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-5227) StackOverflowError in GridCacheMapEntry#checkOwnerChanged()

2019-08-20 Thread Dmitriy Pavlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911525#comment-16911525
 ] 

Dmitriy Pavlov commented on IGNITE-5227:


Folks, why do you resolve tickets without a fix version? Some day it becomes a 
pain in the neck for the release manager to find out where the fix is.

> StackOverflowError in GridCacheMapEntry#checkOwnerChanged()
> ---
>
> Key: IGNITE-5227
> URL: https://issues.apache.org/jira/browse/IGNITE-5227
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 1.6
>Reporter: Alexey Goncharuk
>Assignee: Stepachev Maksim
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A simple test reproducing this error:
> {code}
> /**
>  * @throws Exception if failed.
>  */
> public void testBatchUnlock() throws Exception {
>     startGrid(0);
> 
>     grid(0).createCache(new CacheConfiguration<String, Integer>(DEFAULT_CACHE_NAME)
>         .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL));
> 
>     try {
>         final CountDownLatch releaseLatch = new CountDownLatch(1);
> 
>         IgniteInternalFuture fut = GridTestUtils.runAsync(new Callable<Object>() {
>             @Override public Object call() throws Exception {
>                 IgniteCache cache = grid(0).cache(null);
> 
>                 Lock lock = cache.lock("key");
> 
>                 try {
>                     lock.lock();
> 
>                     releaseLatch.await();
>                 }
>                 finally {
>                     lock.unlock();
>                 }
> 
>                 return null;
>             }
>         });
> 
>         Map putMap = new LinkedHashMap<>();
> 
>         putMap.put("key", "trigger");
> 
>         for (int i = 0; i < 10_000; i++)
>             putMap.put("key-" + i, "value");
> 
>         IgniteCache asyncCache = grid(0).cache(null).withAsync();
> 
>         asyncCache.putAll(putMap);
> 
>         IgniteFuture resFut = asyncCache.future();
> 
>         Thread.sleep(1000);
> 
>         releaseLatch.countDown();
> 
>         fut.get();
> 
>         resFut.get();
>     }
>     finally {
>         stopAllGrids();
>     }
> }
> {code}
> We should replace a recursive call with a simple iteration over the linked 
> list.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-12087) Transactional putAll - significant performance drop on big batches of entries.

2019-08-20 Thread Dmitriy Pavlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911520#comment-16911520
 ] 

Dmitriy Pavlov commented on IGNITE-12087:
-

[~mstepachev] could you take a look at this issue?

> Transactional putAll - significant performance drop on big batches of entries.
> --
>
> Key: IGNITE-12087
> URL: https://issues.apache.org/jira/browse/IGNITE-12087
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Pavel Pereslegin
>Priority: Major
>
> After IGNITE-5227 was fixed, I found a significant performance drop in the 
> putAll operation.
> Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
> After IGNITE-5227 it takes 130 seconds (~100x slower).
> I checked different batch sizes:
> 10_000 - 10 seconds
> 20_000 - 48 seconds
> 30_000 - 130 seconds
> and I was not able to wait for the result with 100_000 entries.
> Reproducer
> {code:java}
> public class CheckPutAll extends GridCommonAbstractTest {
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);
>         CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);
>         ccfg.setAtomicityMode(TRANSACTIONAL);
>         cfg.setCacheConfiguration(ccfg);
>         return cfg;
>     }
>     @Test
>     public void check() throws Exception {
>         int cnt = 30_000;
>         Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));
>         for (int i = 0; i < cnt; i++)
>             data.put(i, i);
>         Ignite node0 = startGrid(0);
>         IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);
>         cache0.putAll(data);
>     }
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-9638) .NET: JVM keeps track of CLR Threads, even when they are finished

2019-08-20 Thread Pavel Tupitsyn (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911517#comment-16911517
 ] 

Pavel Tupitsyn commented on IGNITE-9638:


[~ilyak] I don't think so - again, the problem is that you can't detach some 
other thread, only the current one.

> .NET: JVM keeps track of CLR Threads, even when they are finished 
> --
>
> Key: IGNITE-9638
> URL: https://issues.apache.org/jira/browse/IGNITE-9638
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms
>Affects Versions: 2.6
>Reporter: Ilya Kasnacheev
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.8
>
> Attachments: IgniteRepro.zip
>
>
> When you create a Thread in C#, the JVM creates a corresponding thread 
> "Thread-" which is visible in jstack. When C# joins this thread, it is 
> not removed from the JVM and is kept around. This means that jstack may show 
> thousands of threads which no longer exist. A reproducer is attached. It is 
> presumed that memory will be exhausted eventually.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (IGNITE-12087) Transactional putAll - significant performance drop on big batches of entries.

2019-08-20 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-12087:
--
Description: 
After IGNITE-5227 was fixed, I found a significant performance drop in the 
putAll operation.

Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
After IGNITE-5227 it takes 130 seconds (~100x slower).

I checked different batch sizes:
10_000 - 10 seconds
20_000 - 48 seconds
30_000 - 130 seconds

and I was not able to wait for the result with 100_000 entries.

Reproducer
{code:java}
public class CheckPutAll extends GridCommonAbstractTest {
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);

        ccfg.setAtomicityMode(TRANSACTIONAL);

        cfg.setCacheConfiguration(ccfg);

        return cfg;
    }

    @Test
    public void check() throws Exception {
        int cnt = 30_000;

        Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));

        for (int i = 0; i < cnt; i++)
            data.put(i, i);

        Ignite node0 = startGrid(0);

        IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);

        cache0.putAll(data);
    }
}{code}


  was:
After IGNITE-5227 was fixed, I found a significant performance drop in the 
putAll operation.

Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
After IGNITE-5227 it takes 130 seconds (~100x slower).

I checked different batch sizes:
10_000 - 10 seconds
20_000 - 48 seconds
30_000 - 130 seconds

and I was not able to wait for the result with 100_000 entries.

Reproducer
{code:java}
public class CheckPutAll extends GridCommonAbstractTest {
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);

        ccfg.setAtomicityMode(TRANSACTIONAL);

        cfg.setCacheConfiguration(ccfg);

        return cfg;
    }

    @Test
    public void check() throws Exception {
        int cnt = 30_000;

        // Prepare data.
        Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));

        for (int i = 0; i < cnt; i++)
            data.put(i, i);

        // Start 1 node.
        Ignite node0 = startGrid(0);

        IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);

        // Load data.
        cache0.putAll(data);
    }
}{code}



> Transactional putAll - significant performance drop on big batches of entries.
> --
>
> Key: IGNITE-12087
> URL: https://issues.apache.org/jira/browse/IGNITE-12087
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Pavel Pereslegin
>Priority: Major
>
> After IGNITE-5227 was fixed, I found a significant performance drop in the 
> putAll operation.
> Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
> After IGNITE-5227 it takes 130 seconds (~100x slower).
> I checked different batch sizes:
> 10_000 - 10 seconds
> 20_000 - 48 seconds
> 30_000 - 130 seconds
> and I was not able to wait for the result with 100_000 entries.
> Reproducer
> {code:java}
> public class CheckPutAll extends GridCommonAbstractTest {
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);
>         CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);
>         ccfg.setAtomicityMode(TRANSACTIONAL);
>         cfg.setCacheConfiguration(ccfg);
>         return cfg;
>     }
>     @Test
>     public void check() throws Exception {
>         int cnt = 30_000;
>         Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));
>         for (int i = 0; i < cnt; i++)
>             data.put(i, i);
>         Ignite node0 = startGrid(0);
>         IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);
>         cache0.putAll(data);
>     }
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (IGNITE-12087) Transactional putAll - significant performance drop on big batches of entries.

2019-08-20 Thread Pavel Pereslegin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-12087:
--
Description: 
After IGNITE-5227 have been fixed I found significant performance drop in 
putAll operation.

Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
After IGNITE-5227 - 130 seconds (~100x slower).

I checked a different batch size:
10_000 - 10 seconds
20_000 - 48 seconds
30_000 - 130 seconds

and I was not able to wait for the result of 100_000 entries.

Reproducer
{code:java}
public class CheckPutAll extends GridCommonAbstractTest {
@Override protected IgniteConfiguration getConfiguration(String 
igniteInstanceName) throws Exception {
IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);

ccfg.setAtomicityMode(TRANSACTIONAL);

cfg.setCacheConfiguration(ccfg);

return cfg;
}

@Test
public void check() throws Exception {
int cnt = 30_000;

// Prepare data.
Map data = new HashMap<>(U.capacity(cnt));

for (int i = 0; i < cnt; i++)
data.put(i, i);

// Start 1 node.
Ignite node0 = startGrid(0);

IgniteCache cache0 = node0.cache(DEFAULT_CACHE_NAME);

// Load data.
cache0.putAll(data);
}
}{code}


  was:
After IGNITE-5227 was fixed, I found a significant performance drop in the 
putAll operation.

Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
After IGNITE-5227 it takes 130 seconds (~100x slower).

I checked different batch sizes:
10_000 - 10 seconds
20_000 - 48 seconds
30_000 - 130 seconds

and I was not able to wait for the result with 100_000 entries.

Reproducer
{code:java}
public class CheckPutAll extends GridCommonAbstractTest {
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);

        ccfg.setAtomicityMode(TRANSACTIONAL);

        cfg.setCacheConfiguration(ccfg);

        return cfg;
    }

    @Test
    public void check() throws Exception {
        int cnt = 30_000;

        // Prepare data.
        Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));

        for (int i = 0; i < cnt; i++)
            data.put(i, i);

        // Start 1 node.
        Ignite node0 = startGrid(0);

        node0.cluster().active(true);

        node0.cluster().baselineAutoAdjustTimeout(0);

        IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);

        // Load data.
        cache0.putAll(data);
    }
}{code}



> Transactional putAll - significant performance drop on big batches of entries.
> --
>
> Key: IGNITE-12087
> URL: https://issues.apache.org/jira/browse/IGNITE-12087
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Reporter: Pavel Pereslegin
>Priority: Major
>
> After IGNITE-5227 was fixed, I found a significant performance drop in the 
> putAll operation.
> Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
> After IGNITE-5227 it takes 130 seconds (~100x slower).
> I checked different batch sizes:
> 10_000 - 10 seconds
> 20_000 - 48 seconds
> 30_000 - 130 seconds
> and I was not able to wait for the result with 100_000 entries.
> Reproducer
> {code:java}
> public class CheckPutAll extends GridCommonAbstractTest {
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);
>         CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);
>         ccfg.setAtomicityMode(TRANSACTIONAL);
>         cfg.setCacheConfiguration(ccfg);
>         return cfg;
>     }
>     @Test
>     public void check() throws Exception {
>         int cnt = 30_000;
>         // Prepare data.
>         Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));
>         for (int i = 0; i < cnt; i++)
>             data.put(i, i);
>         // Start 1 node.
>         Ignite node0 = startGrid(0);
>         IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);
>         // Load data.
>         cache0.putAll(data);
>     }
> }{code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IGNITE-12087) Transactional putAll - significant performance drop on big batches of entries.

2019-08-20 Thread Pavel Pereslegin (Jira)
Pavel Pereslegin created IGNITE-12087:
-

 Summary: Transactional putAll - significant performance drop on 
big batches of entries.
 Key: IGNITE-12087
 URL: https://issues.apache.org/jira/browse/IGNITE-12087
 Project: Ignite
  Issue Type: Bug
  Components: cache
Reporter: Pavel Pereslegin


After IGNITE-5227 was fixed, I found a significant performance drop in the 
putAll operation.

Insertion of 30_000 entries before IGNITE-5227 took ~1 second.
After IGNITE-5227 it takes 130 seconds (~100x slower).

I checked different batch sizes:
10_000 - 10 seconds
20_000 - 48 seconds
30_000 - 130 seconds

and I was not able to wait for the result with 100_000 entries.

Reproducer
{code:java}
public class CheckPutAll extends GridCommonAbstractTest {
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        CacheConfiguration ccfg = new CacheConfiguration(DEFAULT_CACHE_NAME);

        ccfg.setAtomicityMode(TRANSACTIONAL);

        cfg.setCacheConfiguration(ccfg);

        return cfg;
    }

    @Test
    public void check() throws Exception {
        int cnt = 30_000;

        // Prepare data.
        Map<Integer, Integer> data = new HashMap<>(U.capacity(cnt));

        for (int i = 0; i < cnt; i++)
            data.put(i, i);

        // Start 1 node.
        Ignite node0 = startGrid(0);

        node0.cluster().active(true);

        node0.cluster().baselineAutoAdjustTimeout(0);

        IgniteCache<Integer, Integer> cache0 = node0.cache(DEFAULT_CACHE_NAME);

        // Load data.
        cache0.putAll(data);
    }
}{code}




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (IGNITE-12071) Test failures after IGNITE-9562 fix in IGFS suite

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12071:

Fix Version/s: (was: 2.7.6)
   2.8

> Test failures after IGNITE-9562 fix in IGFS suite
> -
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.8
>
>
> https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, it is a blocker for the release.
> *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8890685422557348790&branch=%3Cdefault%3E&tab=testDetails
>  *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=3724804704021179739&branch=%3Cdefault%3E&tab=testDetails
>  Changes that may have led to the failure were made by 
>  - eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (IGNITE-12071) Test failures after IGNITE-9562 fix in IGFS suite

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov resolved IGNITE-12071.
-
Resolution: Fixed

Not reproduced in 2.7.6; resolving the issue. It was committed as part of another 
fix for master, here: 
https://github.com/apache/ignite/pull/6765/files#diff-3b0297f8e0e757b6b5ede921d629c6b5R608

> Test failures after IGNITE-9562 fix in IGFS suite
> -
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.8
>
>
> https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, it is a blocker for the release.
> *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8890685422557348790&branch=%3Cdefault%3E&tab=testDetails
>  *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=3724804704021179739&branch=%3Cdefault%3E&tab=testDetails
>  Changes that may have led to the failure were made by 
>  - eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-11470) Views don't show in Dbeaver

2019-08-20 Thread Yury Gerzhedovich (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911419#comment-16911419
 ] 

Yury Gerzhedovich commented on IGNITE-11470:


[~amashenkov], [~pkouznet], guys, could you please review the patch?

> Views don't show in Dbeaver
> ---
>
> Key: IGNITE-11470
> URL: https://issues.apache.org/jira/browse/IGNITE-11470
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Reporter: Yury Gerzhedovich
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: iep-29
> Fix For: 2.8
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In the Database Navigator tab we cannot see any views. As of now we should see 
> at least the SQL system views.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-11470) Views don't show in Dbeaver

2019-08-20 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911417#comment-16911417
 ] 

Ignite TC Bot commented on IGNITE-11470:


{panel:title=Branch: [pull/6456/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4520011&buildTypeId=IgniteTests24Java8_RunAll]

> Views don't show in Dbeaver
> ---
>
> Key: IGNITE-11470
> URL: https://issues.apache.org/jira/browse/IGNITE-11470
> Project: Ignite
>  Issue Type: Task
>  Components: sql
>Reporter: Yury Gerzhedovich
>Assignee: Yury Gerzhedovich
>Priority: Major
>  Labels: iep-29
> Fix For: 2.8
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In the Database Navigator tab we cannot see any views. As of now we should see 
> at least the SQL system views.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (IGNITE-11393) Create IgniteLinkTaglet.toString() implementation for Java9+

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov reassigned IGNITE-11393:
---

Assignee: (was: Dmitriy Pavlov)

> Create IgniteLinkTaglet.toString() implementation for Java9+
> 
>
> Key: IGNITE-11393
> URL: https://issues.apache.org/jira/browse/IGNITE-11393
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Dmitriy Pavlov
>Priority: Major
>
> A new implementation was added according to the new Java API for Javadoc.
> However, the main method was left empty; toString() needs to be implemented to 
> process the IgniteLink annotation.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (IGNITE-12082) [Release] Update versions for pre-build DEB/RPM and describe how to set these versions

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12082:

Summary: [Release] Update versions for pre-build DEB/RPM and describe how 
to set these versions  (was: [Release] Automate version assignment to pre-build 
DEB/RPM or describe how to set packages version to RC version)

> [Release] Update versions for pre-build DEB/RPM and describe how to set these 
> versions
> --
>
> Key: IGNITE-12082
> URL: https://issues.apache.org/jira/browse/IGNITE-12082
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Pavlov
>Assignee: Dmitriy Pavlov
>Priority: Major
> Fix For: 2.7.6
>
>
> Problem:
> https://ci.ignite.apache.org/viewLog.html?buildTypeId=Releases_ApacheIgniteMain_ReleaseBuild&buildId=4513186&branch_Releases_ApacheIgniteMain_ReleaseBuild=ignite-2.7.6
> For RC0 of 2.7.6 the build was successful, but the package versions remained 
> unchanged.
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process does not 
> require the release manager to update versions, but the pre-built DEB & RPM keep 
> the version from the previous release.
> Solution 1 (manual):
> We need to add a new step to
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-4.1.Updatereleasebranchversionsandyearincopyrightmessages
> e.g. 4.1.4, which will ask the release manager to update the versions.
> It may be similar to commit 
> https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=84c2dac5103a448bdaee88cb8290fd6e05a435bb
> Solution 2 (automatic):
> Patch ./scripts/update-versions.sh to set the package versions to the current 
> project version. This will not require any actions from the release manager, 
> since the versions will be updated at step 4.1 together with the other assembly 
> versions.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (IGNITE-12002) [TTL] Some expired data remains in memory even with eager TTL enabled

2019-08-20 Thread Philippe Anes (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philippe Anes resolved IGNITE-12002.

Fix Version/s: 2.8
   Resolution: Fixed

> [TTL] Some expired data remains in memory even with eager TTL enabled
> -
>
> Key: IGNITE-12002
> URL: https://issues.apache.org/jira/browse/IGNITE-12002
> Project: Ignite
>  Issue Type: Bug
>  Components: cache, general
>Affects Versions: 2.7
> Environment: Running on MacOS 10.12.6
> OpenJDK 11
> Ignite v2.7.0
>  
>Reporter: Philippe Anes
>Priority: Major
> Fix For: 2.8
>
>
> Create an Ignite node (with client mode set to false) and put some data (10k 
> entries/values) into it with a very small expiration time (~20s) and eager TTL 
> enabled.
>  Each time the cleanup thread runs it should remove all expired entries, but 
> after a few attempts the thread stops removing all of the expired entries; some 
> of them stay in memory and are not removed by subsequent executions.
>  That means we keep some expired data in memory, which is something we want to 
> avoid.
> Could you please confirm whether this is a real issue or just a 
> misuse/misconfiguration in my test?
> Thanks for your feedback.
>  
> To reproduce:
> Git repo: [https://github.com/panes/ignite-sample]
> Run MyIgniteLoadRunnerTest.run() to reproduce the issue described above.
> (Global setup: writing 10k entries of 64 octets each with a TTL of 10s)
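
For reference, a minimal sketch of the cache setup described above; the cache name is illustrative, and the javax.cache.expiry classes show one way to configure the expiration:

{code:java}
CacheConfiguration<Integer, byte[]> ccfg = new CacheConfiguration<>("ttlCache");

// Eager TTL: the ttl-cleanup-worker thread removes expired entries in the
// background instead of waiting for them to be touched.
ccfg.setEagerTtl(true);

// Entries expire 10 seconds after creation, matching the setup above.
ccfg.setExpiryPolicyFactory(
    javax.cache.expiry.CreatedExpiryPolicy.factoryOf(
        new javax.cache.expiry.Duration(java.util.concurrent.TimeUnit.SECONDS, 10)));
{code}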



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Dmitriy Pavlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911357#comment-16911357
 ] 

Dmitriy Pavlov edited comment on IGNITE-12061 at 8/20/19 1:41 PM:
--

[~zstan] thank you for the contribution,
[~jooger], [~Pavlukhin], thank you for the review, 
[~amashenkov], thank you for reviewing and merging the ticket.

[~zstan] unfortunately I can't automatically cherry-pick the fix to 2.7.6. Could 
you please prepare a 2.7.6-based branch and create a new PR for it? 

An example is https://issues.apache.org/jira/browse/IGNITE-9562 and PR 
https://github.com/apache/ignite/pull/6781

Also, since there is a risk of introducing bugs during merge, I suggest running 
TC Run All on the resulting branch.


was (Author: dpavlov):
[~amashenkov], thank you for reviewing and merging the ticket.

[~zstan] unfortunately I can't automatically cherry-pick the fix to 2.7.6. Could 
you please prepare a 2.7.6-based branch and create a new PR for it? 

An example is https://issues.apache.org/jira/browse/IGNITE-9562 and PR 
https://github.com/apache/ignite/pull/6781

Also, since there is a risk of introducing bugs during merge, I suggest running 
TC Run All on the resulting branch.

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> An INLINE_SIZE that differs from the previous value is not set correctly:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same as before, in this case the default = 10.
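
For illustration, the same steps driven from Java (a sketch; it assumes an SQL-enabled cache with a table t(c1, c2)):

{code:java}
// Recreate an index with a different INLINE_SIZE and observe that the new
// value is silently ignored (the old default of 10 is kept).
cache.query(new SqlFieldsQuery("CREATE INDEX idx0 ON t (c1, c2)")).getAll();
cache.query(new SqlFieldsQuery("DROP INDEX idx0")).getAll();
cache.query(new SqlFieldsQuery("CREATE INDEX idx0 ON t (c1, c2) INLINE_SIZE 100")).getAll();
{code}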



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Dmitriy Pavlov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911357#comment-16911357
 ] 

Dmitriy Pavlov commented on IGNITE-12061:
-

[~amashenkov], thank you for the review and for merging the ticket.

[~zstan] unfortunately I can't automatically cherry-pick the fix to 2.7.6. Could 
you please prepare a 2.7.6-based branch and create a new PR for it? 

An example is https://issues.apache.org/jira/browse/IGNITE-9562 and PR 
https://github.com/apache/ignite/pull/6781

Also, since there is a risk of introducing bugs during merge, I suggest running 
TC Run All on the resulting branch.

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Reopened] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov reopened IGNITE-12061:
-

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Commented] (IGNITE-10619) Add support files transmission between nodes over connection via CommunicationSpi

2019-08-20 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911274#comment-16911274
 ] 

Maxim Muzafarov commented on IGNITE-10619:
--

According to the benchmark there is no performance drop.

||FULL_SYNC||master d9fde7ee1562a84f26e92cd94dd0ac09738414ff||IGNITE-10619 
7d1b87a67d6e115bff41eb80e7755a070b3f32ac||delta (%)||
|IgnitePutBenchmark|446568.01|449137.06|0.58|
|IgnitePutGetBenchmark|326479.18|332820.69|1.91|


> Add support files transmission between nodes over connection via 
> CommunicationSpi
> -
>
> Key: IGNITE-10619
> URL: https://issues.apache.org/jira/browse/IGNITE-10619
> Project: Ignite
>  Issue Type: Sub-task
>  Components: persistence
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
> Fix For: 2.8
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> The partition preloader must support cache partition file relocation from one 
> cluster node to another (the zero-copy algorithm [1] is assumed to be used by 
> default). To achieve this, the file transfer machinery must be implemented in 
> Apache Ignite on top of the Communication SPI.
> _CommunicationSpi_
> Ignite's Communication SPI must support:
> * establishing channel connections to a remote node on an arbitrary topic 
> (GridTopic) with a predefined processing policy;
> * listening for incoming channel creation events and registering connection 
> handlers on the particular node;
> * an arbitrary set of channel parameters on connection handshake;
> _FileTransmitProcessor_
> The file transmission manager must support:
> * different approaches to handling incoming data – buffered and direct (the 
> zero-copy approach of FileChannel#transferTo; see the sketch after this list);
> * transferring data in chunks of a predefined size, saving intermediate 
> results;
> * re-establishing the connection if an error occurs and continuing the file 
> upload/download;
> * limiting connection bandwidth (upload and download) at runtime;
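As a hedged illustration of the zero-copy approach referenced above, using plain NIO only, not the actual CommunicationSpi machinery:

{code:java}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    /**
     * Sends a file over an already-connected socket using
     * FileChannel#transferTo, so bytes move kernel-side without being
     * copied through user-space buffers.
     */
    static void send(Path file, SocketChannel ch) throws IOException {
        try (FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = 0;
            long size = src.size();

            // transferTo may send fewer bytes than requested, so loop;
            // chunking also gives a natural point to save intermediate
            // progress and to re-establish the connection on errors,
            // as the ticket requires.
            while (pos < size)
                pos += src.transferTo(pos, size - pos, ch);
        }
    }
}
{code}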





[jira] [Commented] (IGNITE-9638) .NET: JVM keeps track of CLR Threads, even when they are finished

2019-08-20 Thread Ilya Kasnacheev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911266#comment-16911266
 ] 

Ilya Kasnacheev commented on IGNITE-9638:
-

Is it possible to have a kind of thread garbage collection, where finished 
threads would be detached periodically?

> .NET: JVM keeps track of CLR Threads, even when they are finished 
> --
>
> Key: IGNITE-9638
> URL: https://issues.apache.org/jira/browse/IGNITE-9638
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms
>Affects Versions: 2.6
>Reporter: Ilya Kasnacheev
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.8
>
> Attachments: IgniteRepro.zip
>
>
> When you create a Thread in C#, the JVM creates a corresponding thread 
> "Thread-" which is visible in jstack. When C# joins this thread, it is 
> not removed from the JVM and is kept around. This means that jstack may show 
> thousands of threads which are no longer there. A reproducer is attached. It 
> is presumed that memory will be exhausted eventually.





[jira] [Commented] (IGNITE-12083) [Release] Change release scripts according pre-build DEB/RPM folders

2019-08-20 Thread Peter Ivanov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911263#comment-16911263
 ] 

Peter Ivanov commented on IGNITE-12083:
---

Looks good to me.

> [Release] Change release scripts according pre-build DEB/RPM folders
> 
>
> Key: IGNITE-12083
> URL: https://issues.apache.org/jira/browse/IGNITE-12083
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Pavlov
>Assignee: Dmitriy Pavlov
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Problem: 
> svn: E02: Can't stat '/mnt/c/dev_env/release-2.7.6-rc0/packaging/pkg': No 
> such file or directory
> Solution 1:
> change the release scripts accordingly, PR#5
> Solution 2:
> change the folder from 'packages' to 'packaging' on TeamCity.





[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911260#comment-16911260
 ] 

Ignite TC Bot commented on IGNITE-12061:


{panel:title=Branch: [pull/6770/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4518671&buildTypeId=IgniteTests24Java8_RunAll]

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Commented] (IGNITE-10808) Discovery message queue may build up with TcpDiscoveryMetricsUpdateMessage

2019-08-20 Thread Sergey Chugunov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911258#comment-16911258
 ] 

Sergey Chugunov commented on IGNITE-10808:
--

[~dmekhanikov],

Along with the metrics update message, your change also affects 
TcpDiscoveryClientAckResponse, which will no longer be processed with priority 
over other messages. This may be risky. 
Could you check what the consequences are for client nodes' stability if acks 
are delivered to them with some delay?

Thanks.

> Discovery message queue may build up with TcpDiscoveryMetricsUpdateMessage
> --
>
> Key: IGNITE-10808
> URL: https://issues.apache.org/jira/browse/IGNITE-10808
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Stanislav Lukyanov
>Assignee: Denis Mekhanikov
>Priority: Major
>  Labels: discovery
> Fix For: 2.8
>
> Attachments: IgniteMetricsOverflowTest.java
>
>
> A node receives a new metrics update message every `metricsUpdateFrequency` 
> milliseconds, and the message is put at the top of the queue (because it 
> is a high-priority message).
> If processing one message takes more than `metricsUpdateFrequency`, then 
> multiple `TcpDiscoveryMetricsUpdateMessage`s will be in the queue. A long 
> enough delay (e.g. caused by a network glitch or GC) may lead to the queue 
> building up tens of metrics update messages which are essentially useless to 
> process. Finally, if processing a message on average takes a little more 
> than `metricsUpdateFrequency` (even for a relatively short period of time, 
> say, a minute due to network issues), then the message worker will end up 
> processing only the metrics updates and the cluster will essentially hang.
> A reproducer is attached. In the test, the queue first builds up and is then 
> very slowly torn down, causing "Failed to wait for PME" messages.
> We need to change ServerImpl's SocketReader not to put another metrics update 
> message at the top of the queue if it already has one (or to replace the one 
> at the top with the new one).
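A minimal sketch of the suggested change, assuming a deque-based message queue; the type names are stand-ins, not the actual ServerImpl code:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class DiscoveryQueue {
    /** Marker interface standing in for TcpDiscoveryAbstractMessage. */
    interface Msg {}

    /** Stand-in for TcpDiscoveryMetricsUpdateMessage. */
    static class MetricsUpdateMsg implements Msg {}

    private final Deque<Msg> queue = new ArrayDeque<>();

    /** High-priority add that avoids stacking metrics updates. */
    synchronized void addHighPriority(Msg msg) {
        // Replace a stale metrics update already at the top instead of
        // stacking a second one on it.
        if (msg instanceof MetricsUpdateMsg && queue.peekFirst() instanceof MetricsUpdateMsg)
            queue.pollFirst();

        queue.addFirst(msg);
    }

    synchronized Msg poll() {
        return queue.pollFirst();
    }
}
{code}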





[jira] [Updated] (IGNITE-12071) Test failures after IGNITE-9562 fix in IGFS suite

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12071:

Ignite Flags:   (was: Docs Required)

> Test failures after IGNITE-9562 fix in IGFS suite
> -
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.7.6
>
>
> https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, it is a blocker for the release.
> *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8890685422557348790&branch=%3Cdefault%3E&tab=testDetails
>  *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=3724804704021179739&branch=%3Cdefault%3E&tab=testDetails
>  Changes that may have led to the failure were made by 
>  - eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258





[jira] [Updated] (IGNITE-9562) Destroyed cache that resurrected on an old offline node breaks PME

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-9562:
---
Labels: 2.7.6-rc1  (was: )

> Destroyed cache that resurrected on an old offline node breaks PME
> --
>
> Key: IGNITE-9562
> URL: https://issues.apache.org/jira/browse/IGNITE-9562
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.5
>Reporter: Pavel Kovalenko
>Assignee: Eduard Shangareev
>Priority: Critical
>  Labels: 2.7.6-rc1
> Fix For: 2.7.6
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Given:
> 2 nodes, persistence enabled.
> 1) Stop 1 node
> 2) Destroy cache through client
> 3) Start stopped node
> When the stopped node joins the cluster it starts all caches that it had seen 
> before stopping.
> If such a cache was destroyed cluster-wide, this breaks the crash recovery 
> process or PME.
> Root cause: we don't start/collect caches from the stopped node on the other 
> part of the cluster.
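A rough Java sketch of steps 1)-3) above, assuming two server nodes with persistence and an illustrative cache name (real configuration details omitted):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ResurrectedCacheRepro {
    static IgniteConfiguration cfg(String name, boolean client) {
        DataStorageConfiguration ds = new DataStorageConfiguration();
        ds.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        return new IgniteConfiguration()
            .setIgniteInstanceName(name)
            .setClientMode(client)
            .setDataStorageConfiguration(ds);
    }

    public static void main(String[] args) {
        Ignite n1 = Ignition.start(cfg("node1", false));
        Ignition.start(cfg("node2", false));

        n1.cluster().active(true);
        n1.getOrCreateCache("cache");

        // 1) Stop one node (its persisted cache data stays on disk).
        Ignition.stop("node2", true);

        // 2) Destroy the cache through a client while node2 is down.
        Ignite client = Ignition.start(cfg("client", true));
        client.destroyCache("cache");

        // 3) Start the stopped node: it resurrects the destroyed cache
        //    from its local persistence, breaking crash recovery / PME.
        Ignition.start(cfg("node2", false));
    }
}
{code}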
> In case of PARTITIONED cache mode that scenario breaks crash recovery:
> {noformat}
> java.lang.AssertionError: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:696)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2449)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:679)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionStates(GridCacheDatabaseSharedManager.java:2445)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2321)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1568)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1308)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1255)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:766)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2577)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2457)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> In case of REPLICATED cache mode that scenario breaks PME coordinator process:
> {noformat}
> [2018-09-12 
> 18:50:36,407][ERROR][sys-#148%distributed.CacheStopAndRessurectOnOldNodeTest0%][GridCacheIoManager]
>  Failed to process message [senderId=4b6fd0d4-b756-4a9f-90ca-f0ee2511, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionsSingleMessage]
> java.lang.AssertionError: 3080586
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.clientTopology(GridCachePartitionExchangeManager.java:815)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.updatePartitionSingleMap(GridDhtPartitionsExchangeFuture.java:3621)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:2439)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$100(GridDhtPartitionsExchangeFuture.java:137)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2261)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2249)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353)
>   at 

[jira] [Updated] (IGNITE-9562) Destroyed cache that resurrected on an old offline node breaks PME

2019-08-20 Thread Dmitriy Pavlov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-9562:
---
Fix Version/s: (was: 2.8)

> Destroyed cache that resurrected on an old offline node breaks PME
> --
>
> Key: IGNITE-9562
> URL: https://issues.apache.org/jira/browse/IGNITE-9562
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.5
>Reporter: Pavel Kovalenko
>Assignee: Eduard Shangareev
>Priority: Critical
> Fix For: 2.7.6
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Given:
> 2 nodes, persistence enabled.
> 1) Stop 1 node
> 2) Destroy cache through client
> 3) Start stopped node
> When the stopped node joins the cluster it starts all caches that it had seen 
> before stopping.
> If such a cache was destroyed cluster-wide, this breaks the crash recovery 
> process or PME.
> Root cause: we don't start/collect caches from the stopped node on the other 
> part of the cluster.
> In case of PARTITIONED cache mode that scenario breaks crash recovery:
> {noformat}
> java.lang.AssertionError: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:696)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2449)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:679)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionStates(GridCacheDatabaseSharedManager.java:2445)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2321)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1568)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1308)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1255)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:766)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2577)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2457)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> In case of REPLICATED cache mode that scenario breaks PME coordinator process:
> {noformat}
> [2018-09-12 
> 18:50:36,407][ERROR][sys-#148%distributed.CacheStopAndRessurectOnOldNodeTest0%][GridCacheIoManager]
>  Failed to process message [senderId=4b6fd0d4-b756-4a9f-90ca-f0ee2511, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionsSingleMessage]
> java.lang.AssertionError: 3080586
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.clientTopology(GridCachePartitionExchangeManager.java:815)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.updatePartitionSingleMap(GridDhtPartitionsExchangeFuture.java:3621)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:2439)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$100(GridDhtPartitionsExchangeFuture.java:137)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2261)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2249)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353)
>   at 
> org.apache.ignite.internal.

[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Stanilovsky Evgeny (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911221#comment-16911221
 ] 

Stanilovsky Evgeny commented on IGNITE-12061:
-

[~amashenkov] can you merge it?
Thanks!

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Stanilovsky Evgeny (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911218#comment-16911218
 ] 

Stanilovsky Evgeny commented on IGNITE-12061:
-

*Release notes:*
Fixed a bug where the inline_size of an existing index could not be changed by 
dropping the index and recreating it with a different value.

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-20 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911200#comment-16911200
 ] 

Ignite TC Bot commented on IGNITE-12061:


{panel:title=Branch: [pull/6770/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4518671&buildTypeId=IgniteTests24Java8_RunAll]

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not correctly set:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> inline_size remains the same, in this case the default (10).





[jira] [Commented] (IGNITE-9562) Destroyed cache that resurrected on an old offline node breaks PME

2019-08-20 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911183#comment-16911183
 ] 

Ignite TC Bot commented on IGNITE-9562:
---

{panel:title=Branch: [pull/6781/head] Base: [ignite-2.7.6] : Possible Blockers 
(6)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform C++ (Linux Clang){color} [[tests 0 Exit Code , Failure 
on metric |https://ci.ignite.apache.org/viewLog.html?buildId=4516785]]

{color:#d04437}Platform .NET (Inspections)*{color} [[tests 0 Failure on metric 
|https://ci.ignite.apache.org/viewLog.html?buildId=4516787]]

{color:#d04437}Platform C++ (Linux)*{color} [[tests 0 Exit Code , Failure on 
metric |https://ci.ignite.apache.org/viewLog.html?buildId=4516789]]

{color:#d04437}PDS 1{color} [[tests 
2|https://ci.ignite.apache.org/viewLog.html?buildId=4516793]]
* IgnitePdsTestSuite: IgnitePdsDestroyCacheTest.testDestroyCachesAbruptly - 
Test has low fail rate in base branch 0,0% and is not flaky
* IgnitePdsTestSuite: IgnitePdsDestroyCacheTest.testDestroyGroupCachesAbruptly 
- Test has low fail rate in base branch 0,0% and is not flaky

{color:#d04437}Platform C++ (Win x64 / Release){color} [[tests 0 
BuildFailureOnMessage 
|https://ci.ignite.apache.org/viewLog.html?buildId=4516797]]

{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4514248&buildTypeId=IgniteTests24Java8_RunAll]

> Destroyed cache that resurrected on an old offline node breaks PME
> --
>
> Key: IGNITE-9562
> URL: https://issues.apache.org/jira/browse/IGNITE-9562
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.5
>Reporter: Pavel Kovalenko
>Assignee: Eduard Shangareev
>Priority: Critical
> Fix For: 2.8, 2.7.6
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Given:
> 2 nodes, persistence enabled.
> 1) Stop 1 node
> 2) Destroy cache through client
> 3) Start stopped node
> When the stopped node joins the cluster it starts all caches that it had seen 
> before stopping.
> If such a cache was destroyed cluster-wide, this breaks the crash recovery 
> process or PME.
> Root cause: we don't start/collect caches from the stopped node on the other 
> part of the cluster.
> In case of PARTITIONED cache mode that scenario breaks crash recovery:
> {noformat}
> java.lang.AssertionError: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:696)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2449)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:679)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionStates(GridCacheDatabaseSharedManager.java:2445)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2321)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1568)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1308)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1255)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:766)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2577)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2457)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> In case of REPLICATED cache mode that scenario breaks PME coordinator process:
> {noformat}
> [2018-09-12 
> 18:50:36,407][ERROR][sys-#148%distributed.CacheStopAndRessurectOnOldNodeTest0%][GridCacheIoManager]
>  Failed to process message [senderId=4b6fd0d4-b756-4a9f-90ca-f0ee2511, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionsSingleMessage]
> java.lang.AssertionError: 3080586
>

[jira] [Commented] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911129#comment-16911129
 ] 

Kirill Tkalenko commented on IGNITE-12080:
--

[~EdShangGG] Please code review.

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag
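A hedged sketch of the kind of per-supplier aggregate such logging could collect; all field names here are illustrative, not the patch's actual API:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Illustrative holder for the per-supplier numbers listed above. */
public class SupplierRebalanceInfo {
    final UUID nodeId;        // Supplier node id.
    long primaryParts;        // Number of supplied primary partitions.
    long backupParts;         // Number of supplied backup partitions.
    long entries;             // Total number of received entries.
    long bytes;               // Total size of received partitions, in bytes.
    long durationMs;          // Time spent getting and processing partitions.
    final List<Integer> partIds = new ArrayList<>(); // Requested partition ids.

    SupplierRebalanceInfo(UUID nodeId) {
        this.nodeId = nodeId;
    }

    /** One log line per supplier when rebalance finishes on the demander. */
    @Override public String toString() {
        return "Supplier [node=" + nodeId +
            ", primaryParts=" + primaryParts + ", backupParts=" + backupParts +
            ", entries=" + entries + ", bytes=" + bytes +
            ", durationMs=" + durationMs + ", parts=" + partIds + ']';
    }
}
{code}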





[jira] [Updated] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-12080:
-
Reviewer: Eduard Shangareev  (was: Vladislav Pyatkov)

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag





[jira] [Commented] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1696#comment-1696
 ] 

Vladislav Pyatkov commented on IGNITE-12080:


[~ktkale...@gridgain.com] looks good to me.

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag





[jira] [Updated] (IGNITE-12086) Ignite pod keeps crashing and failed to recover the node

2019-08-20 Thread radhakrupa (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radhakrupa updated IGNITE-12086:

Description: 
Ignite has been deployed on Kubernetes with 3 replicas of the server pod. 
The pods were up and running fine for 9 days. We have created 180 inventory 
tables and 204 transactional tables. The data has been inserted with the 
PyIgnite client using the cache.put() method. This is a very slow operation 
because each insert is committed one at a time, so bulk-style inserts are not 
possible. PyIgnite was inserting into about 20 of the inventory tables 
simultaneously (20 different threads/processes).

The cluster was nowhere near stable: after 9 days one of the pods crashed and 
failed to recover. Below is the error log:

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"ERROR","system":"ignite-service","time":"2019-08-16T17:13:34,769Z","logger":"GridCachePartitionExchangeManager","timezone":"UTC","log":"Failed
 to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage 
[reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null, 
startCaches=[BgwService]] java.lang.NullPointerException| at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)|
 at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)|
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)| 
at java.lang.Thread.run(Thread.java:748)"}

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"WARN","system":"ignite-service","time":"2019-08-16T17:13:36,724Z","logger":"GridCacheDatabaseSharedManager","timezone":"UTC","log":"Ignite
 node stopped in the middle of checkpoint. Will restore memory state and finish 
checkpoint on node start."}

The error report file and ignite-config.xml have been attached for your info.

Heap memory and RAM configuration for each of the Ignite server 
containers:

Heap memory: 32 GB

RAM: 64 GB

Default memory region: 

cpu: 4

Persistence volume:

wal_storage_size: 10GB

persistence_storage_size: 10GB

 

  was:
Ignite has been deployed on the kubernets , there are 3 replicas of server pod. 
The pods were up and running fine for 9 days.  We have created 180 invent 
tables and 204 transactional tables. The data has been inserted using the 
PyIgnite client using the cache.put() method.  This is a very slow operation 
because PyIgnite is very slow.  Each insert is committed one at a time, so it 
is not able to do bulk-style inserts. The PyIgnite was inserting about 20 of 
the inventory tables simultaneously (20 different threads/processes).

The cluster was nowhere stable after 9days, one of the pod crashed and failed 
to recover. Below is the error log:

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"ERROR","system":"ignite-service","time":"2019-08-16T17:13:34,769Z","logger":"GridCachePartitionExchangeManager","timezone":"UTC","log":"Failed
 to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage 
[reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null, 
startCaches=[BgwService]] java.lang.NullPointerException| at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)|
 at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)|
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)| 
at java.lang.Thread.run(Thread.java:748)"} 
\{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"WARN","system":"ignite-service","time":"2019-08-16T17:13:36,724Z","logger":"GridCacheDatabaseSharedManager","timezone":"UTC","log":"Ignite
 node stopped in the middle of checkpoint. Will restore memory state and finish 
checkpoint on node start."}

The error report file and ignite-

[jira] [Created] (IGNITE-12086) Ignite pod keeps crashing and failed to recover the node

2019-08-20 Thread radhakrupa (Jira)
radhakrupa created IGNITE-12086:
---

 Summary: Ignite pod keeps crashing and failed to recover the node 
 Key: IGNITE-12086
 URL: https://issues.apache.org/jira/browse/IGNITE-12086
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.7
Reporter: radhakrupa
 Attachments: hs_err_pid116.log, ignite-config.xml

Ignite has been deployed on Kubernetes with 3 replicas of the server pod. 
The pods were up and running fine for 9 days. We have created 180 inventory 
tables and 204 transactional tables. The data has been inserted with the 
PyIgnite client using the cache.put() method. This is a very slow operation 
because each insert is committed one at a time, so bulk-style inserts are not 
possible. PyIgnite was inserting into about 20 of the inventory tables 
simultaneously (20 different threads/processes).
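For contrast with the one-put-at-a-time loading described above, a hedged Java sketch of Ignite's bulk-style loading via IgniteDataStreamer (the report used the pyignite thin client, which per the report could not batch this way); the cache name is illustrative:

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class BulkLoad {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("inventory");

            // The streamer batches entries per node and loads them in
            // bulk, instead of one synchronous put per entry.
            try (IgniteDataStreamer<Integer, String> streamer =
                     ignite.dataStreamer("inventory")) {
                for (int i = 0; i < 1_000_000; i++)
                    streamer.addData(i, "row-" + i);
            } // close() flushes the remaining batches.
        }
    }
}
{code}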

The cluster was nowhere near stable: after 9 days one of the pods crashed and 
failed to recover. Below is the error log:

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"ERROR","system":"ignite-service","time":"2019-08-16T17:13:34,769Z","logger":"GridCachePartitionExchangeManager","timezone":"UTC","log":"Failed
 to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage 
[reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null, 
startCaches=[BgwService]] java.lang.NullPointerException| at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)|
 at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)|
 at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)|
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)| 
at java.lang.Thread.run(Thread.java:748)"}

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"WARN","system":"ignite-service","time":"2019-08-16T17:13:36,724Z","logger":"GridCacheDatabaseSharedManager","timezone":"UTC","log":"Ignite
 node stopped in the middle of checkpoint. Will restore memory state and finish 
checkpoint on node start."}

The error report file and ignite-config.xml have been attached for your info.

Heap memory and RAM configuration for each of the Ignite server 
containers:

Heap memory: 32 GB

RAM: 64 GB

Default memory region: 

cpu: 4

Persistence volume:

wal_storage_size: 10GB

persistence_storage_size: 10GB

 





[jira] [Comment Edited] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911065#comment-16911065
 ] 

Kirill Tkalenko edited comment on IGNITE-12080 at 8/20/19 7:53 AM:
---

[~v.pyatkov] Please code review.


was (Author: ktkale...@gridgain.com):
[~antonovsergey93] Please code review.

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag





[jira] [Updated] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Kirill Tkalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Tkalenko updated IGNITE-12080:
-
Reviewer: Vladislav Pyatkov  (was: Sergey Antonov)

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag





[jira] [Commented] (IGNITE-12080) Add extended logging for rebalance

2019-08-20 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911062#comment-16911062
 ] 

Ignite TC Bot commented on IGNITE-12080:


{panel:title=Branch: [pull/6785/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4506939&buildTypeId=IgniteTests24Java8_RunAll]

> Add extended logging for rebalance
> --
>
> Key: IGNITE-12080
> URL: https://issues.apache.org/jira/browse/IGNITE-12080
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>
> We should log all information about a finished rebalance on the demander node.
> I'd like to have in the log:
> h3. Total information:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, number of supplied entries, number of bytes, duration of getting 
> and processing partitions from the supplier)
> h3. Information per cache group:
> # Rebalance duration, rebalance start time/rebalance finish time
> # How many partitions were processed in each topic (number of partitions, 
> number of entries, number of bytes)
> # How many nodes were suppliers in rebalance (nodeId, number of supplied 
> partitions, list of partition ids with PRIMARY/BACKUP flag, number of supplied 
> entries, number of bytes, duration of getting and processing partitions from 
> the supplier)
> # Information about each partition's distribution (list of nodeIds with 
> primary/backup flag and the supplier nodeId marked)
> h3. Information per supplier node:
> # How many partitions were requested:
> #* Total number
> #* Primary/backup distribution (number of primary partitions, number of 
> backup partitions)
> #* Total number of entries
> #* Total size of partitions in bytes
> # How many partitions were requested per cache group:
> #* Number of requested partitions
> #* Number of entries in partitions
> #* Total size of partitions in bytes
> #* List of requested partitions with size in bytes, entry count, and primary 
> or backup partition flag


