Re: Cache spreading to new nodes

2019-08-14 Thread Denis Mekhanikov
Marco,

Setting the rebalance mode to NONE means that your cache won’t be rebalanced at all 
unless you trigger it manually.
I think it’s better not to set it: if you never trigger the rebalance, only one 
node will store the cache.

Also, the backup filter specified in the affinity function doesn’t seem correct 
to me. It’s always true, since your node filter accepts only those nodes that 
are in the nodesForOptimization list.
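
To see why the backup filter adds nothing, here is a small stand-alone sketch in plain Java (no Ignite classes). It assumes that the affinity function is only ever offered nodes that already passed the node filter; BackupFilterCheck and its method are names made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.function.BiPredicate;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class BackupFilterCheck {
    /** Returns true if the backup filter rejects at least one pair of nodes admitted by the node filter. */
    static boolean backupFilterEverRejects(List<UUID> cluster, List<UUID> allowed) {
        // Same shape as the node filter in the quoted configuration.
        Predicate<UUID> nodeFilter = allowed::contains;

        // Same shape as the backup filter passed to the affinity function.
        BiPredicate<UUID, UUID> backupFilter =
            (primary, backup) -> allowed.contains(primary) && allowed.contains(backup);

        // Assumption: only nodes that pass the node filter reach the affinity function.
        List<UUID> candidates = cluster.stream().filter(nodeFilter).collect(Collectors.toList());

        for (UUID primary : candidates)
            for (UUID backup : candidates)
                if (!backupFilter.test(primary, backup))
                    return true;

        return false;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        UUID c = UUID.randomUUID();

        // Three nodes in the cluster, two of them allowed to hold the cache.
        System.out.println(backupFilterEverRejects(Arrays.asList(a, b, c), Arrays.asList(a, b))); // prints false
    }
}
```

Every candidate pair is drawn from the allowed list, so the backup filter can never return false and can be dropped.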

What does the fetchNodes() method do?
The recommended way to implement node filters is to check custom node 
attributes using an AttributeNodeFilter.
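
A minimal sketch of that approach, assuming the nodes that should hold the cache are started with a custom user attribute. The attribute name "graph.cache.node", the cache name "graphCache" and the Integer/String key and value types are placeholders chosen for this example, not taken from the original configuration.

```java
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.util.AttributeNodeFilter;

final class AttributeFilterSketch {
    public static void main(String[] args) {
        // Mark this node as eligible for the cache via a user attribute (hypothetical name).
        IgniteConfiguration igniteCfg = new IgniteConfiguration()
            .setUserAttributes(Collections.singletonMap("graph.cache.node", "true"));

        try (Ignite ignite = Ignition.start(igniteCfg)) {
            // Only nodes carrying the attribute with the matching value will store the cache.
            CacheConfiguration<Integer, String> graphCfg =
                new CacheConfiguration<Integer, String>("graphCache")
                    .setCacheMode(CacheMode.REPLICATED)
                    .setNodeFilter(new AttributeNodeFilter("graph.cache.node", "true"));

            ignite.getOrCreateCache(graphCfg);
        }
    }
}
```

Nodes started without the attribute join the cluster but are skipped by the filter, so the decision no longer depends on a list of node IDs captured at configuration time.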

Partition map exchange is a process that happens after every topology change. 
Nodes exchange information about the partition distribution of the caches, so 
you can’t prevent it from happening.
The message that you see is a symptom, not a cause.

Denis


> On 13 Aug 2019, at 09:50, Marco Bernagozzi wrote:
> 
> Hi, I did some more digging and discovered that the issue seems to be: 
> 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture:
>  Completed partition exchange 
> 
> Is there any way to disable or limit the partition exchange? 
> 
> Best, 
> Marco 
> 
> On Mon, 12 Aug 2019 at 16:59, Andrei Aleksandrov wrote:
> Hi,
> 
> Could you share the whole reproducer with all configurations and required 
> methods?
> 
> BR,
> Andrei
> 
> On 8/12/2019 at 4:48 PM, Marco Bernagozzi wrote:
>> I have a set of nodes, and I want to be able to set a cache in specific 
>> nodes. It works, but whenever I turn on a new node the cache is 
>> automatically spread to that node, which then causes errors like: 
>> Failed over job to a new node ( I guess that there was a computation going 
>> on in a node that shouldn't have computed that, and was shut down in the 
>> meantime). 
>> 
>> I don't know if I'm doing something wrong here or I'm missing something. 
>> As I understand it, NodeFilter and Affinity are equivalent in my case 
>> (Affinity is a node filter which also creates rules on where can the cache 
>> spread from a given node?). With rebalance mode set to NONE, shouldn't the 
>> cache be spread on the "nodesForOptimization" nodes, according to either the 
>> node filter or the affinityFunction? 
>> 
>> Here's my code: 
>> 
>> List nodesForOptimization = fetchNodes();
>>
>> CacheConfiguration graphCfg = new CacheConfiguration<>(graphCacheName);
>> graphCfg = graphCfg.setCacheMode(CacheMode.REPLICATED)
>>     .setBackups(nodesForOptimization.size() - 1)
>>     .setAtomicityMode(CacheAtomicityMode.ATOMIC)
>>     .setRebalanceMode(CacheRebalanceMode.NONE)
>>     .setStoreKeepBinary(true)
>>     .setCopyOnRead(false)
>>     .setOnheapCacheEnabled(false)
>>     .setNodeFilter(u -> nodesForOptimization.contains(u.id()))
>>     .setAffinity(
>>         new RendezvousAffinityFunction(
>>             1024,
>>             (c1, c2) -> nodesForOptimization.contains(c1.id())
>>                 && nodesForOptimization.contains(c2.id())
>>         )
>>     )
>>     .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);



Re: IgniteCache.destroy() taking long time

2019-08-14 Thread Denis Mekhanikov
Folks,

Partition map exchange (PME) will actually happen in both cases: PARTITIONED 
and REPLICATED.
We need to understand which part of the destroy takes the longest.

If you enable INFO logs, you’ll see messages about partition map exchange 
happening when you destroy caches.
Check whether PME takes most of the time, or whether it’s fast and something 
else is blocking the destruction.

Could you tell us how many nodes you have in your cluster?
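
One rough way to separate the two cases is to time the destroy() call from the application side and compare that number with the exchange durations reported in the INFO log. A minimal sketch in plain Java follows; DestroyTimer and timeMillis are hypothetical helper names, and the sleep only stands in for the real cache.destroy() call.

```java
import java.util.concurrent.TimeUnit;

final class DestroyTimer {
    /** Runs the action and returns its wall-clock duration in milliseconds. */
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-in for the real call; in the application this would be () -> cache.destroy().
        long elapsed = timeMillis(() -> sleep(50));
        System.out.println("destroy() took " + elapsed + " ms");
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If the measured time is close to the PME duration in the log, the exchange dominates; if it is much larger, something else is holding up the destroy.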

Denis

> On 14 Aug 2019, at 18:11, Alexander Kor wrote:
> 
> Hi,
> Can you please share your cache configuration? How many nodes do you 
> have in your cluster?
> If you are running in PARTITIONED mode then some exchange of information 
> will occur.  
> More details here: https://apacheignite.readme.io/docs/cache-modes 
> 
> Do you have a reproducer project?  
> Thanks, Alex
>   
> 
> On Wed, Aug 14, 2019 at 1:22 AM Shravya Nethula wrote:
> Hi, 
> 
> I have created a cache using the following API: 
> IgniteCache cache = (IgniteCache) 
> ignite.getOrCreateCache(cacheCfg); 
> 
> Now when I try to delete the cache using the IgniteCache.destroy() API, it 
> takes about 12-13 seconds. 
> 
> Why does it take so long? Will there be any exchange of cache 
> information among the nodes whenever a cache is deleted? 
> Is there any way to reduce the execution time? 
> 
> Regards, 
> Shravya Nethula.
> 
> 
> 
> Regards,
> Shravya Nethula,
> BigData Developer,
> 
> Hyderabad.
> 



Re: IgniteCache.destroy() taking long time

2019-08-14 Thread Alexander Kor
Hi,
Can you please share your cache configuration? How many nodes do you
have in your cluster?
If you are running in PARTITIONED mode then some exchange of information
will occur.
More details here: https://apacheignite.readme.io/docs/cache-modes
Do you have a reproducer project?
Thanks, Alex


On Wed, Aug 14, 2019 at 1:22 AM Shravya Nethula <
shravya.neth...@aline-consulting.com> wrote:

> Hi,
>
> I have created a cache using the following API:
> IgniteCache cache = (IgniteCache)
> ignite.getOrCreateCache(cacheCfg);
>
> Now when I try to delete the cache using the IgniteCache.destroy() API, it
> takes about 12-13 seconds.
>
> Why does it take so long? Will there be any exchange of cache
> information among the nodes whenever a cache is deleted?
> Is there any way to reduce the execution time?
>
> Regards,
> Shravya Nethula.
>
>
> Regards,
>
> Shravya Nethula,
>
> BigData Developer,
>
>
> Hyderabad.
>
>


Re: [EXTERNAL] Re: Replace or Put after PutAsync causes Ignite to hang

2019-08-14 Thread e.llull
Hi guys, 

We are also facing a similar problem, if not the same. Our main difference
with the initial reproducer is that we are using the Thick Client. We
applied the suggested fix of setting the SynchronizationContext, but we also
perform a GetAsync after the initial PutAsync. Also, I added a loop around
the Replace because sometimes it takes several iterations to block. 

Here is the code: 
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Transactions;

namespace IgniteHangTest
{
    class Program : IDisposable
    {
        protected readonly IIgnite server;

        protected readonly IIgnite client;

        public static async Task Main(string[] args)
        {
            SynchronizationContext.SetSynchronizationContext(new ThreadPoolSynchronizationContext());

            using (var program = new Program())
            {
                await program.Run();
            }
        }

        public Program()
        {
            // Server node that hosts the transactional cache.
            server = Ignition.Start(IgniteConfiguration("server"));
            server.GetOrCreateCache<int, string>(new CacheConfiguration("TestCache")
            {
                AtomicityMode = CacheAtomicityMode.Transactional,
            });

            // Thick client node running in the same process.
            var clientConfiguration = IgniteConfiguration("client");
            clientConfiguration.ClientMode = true;
            client = Ignition.Start(clientConfiguration);
        }

        private async Task Run()
        {
            var cache = client.GetCache<int, string>("TestCache");

            Console.WriteLine("Put initial value");
            await cache.PutAsync(0, "Test");

            Console.WriteLine("Get initial value");
            string initialValue = await cache.GetAsync(0);  // if removed, it works

            Console.WriteLine("Entering Replace loop");
            for (int i = 0; i < 100; i++)
            {
                cache.Replace(0, "Replace " + i);  // It blocks here
                Console.WriteLine("Loop: i = {0}", i);
            }

            Console.WriteLine("End");
        }

        public void Dispose()
        {
            Ignition.Stop("client", true);
            Ignition.Stop("server", true);
        }

        private IgniteConfiguration IgniteConfiguration(string instanceName)
        {
            return new IgniteConfiguration
            {
                IgniteInstanceName = instanceName,
                JvmOptions = new List<string> { "-DIGNITE_QUIET=false", },
                TransactionConfiguration = new TransactionConfiguration
                {
                    DefaultTimeout = TimeSpan.FromSeconds(5),
                    DefaultTransactionConcurrency = TransactionConcurrency.Optimistic,
                    DefaultTransactionIsolation = TransactionIsolation.Serializable
                },
            };
        }
    }

    class ThreadPoolSynchronizationContext : SynchronizationContext { }
}

In the reproducer we are starting two Ignite nodes, one as the server and
one with ClientMode = true. This is only in the reproducer; in the real use
case the server Ignite node is started on a different machine, but the
problem also arises with the "external" server Ignite node. 

If the line `string initialValue = await cache.GetAsync(0);` is removed, the
program finishes successfully. 

In the console, the relevant logs are: 

Critical system error detected. Will be handled accordingly to configured
handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696]]] 
class org.apache.ignite.IgniteException: GridWorker [name=sys-stripe-6,
igniteInstanceName=client, finished=false, heartbeatTs=1565776844696] 
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)

And the stack trace of the sys-stripe-6 thread is: 
Thread [name="sys-stripe-6-#52%client%", id=82, state=WAITING, blockCnt=0,
waitCnt=11] 
at sun.misc.Unsafe.park(Native Method) 
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) 
at

IgniteQueue.removeAll() throwing NPE

2019-08-14 Thread colinc
I am using IgniteQueue to store some POJOs in memory, which are removed at
15-minute intervals after some processing. During the processing, elements
are added to and removed from the queue multiple times using the removeAll() API.
Below is my queue configuration:

@Override
public IgniteQueue create(QueueName queueName, int capacity) {
    return ignite.queue(queueName.getQueueName(), // Queue name.
        capacity,                                 // Queue capacity. 0 for unbounded queue.
        getCollectionConfiguration());
}

private CollectionConfiguration getCollectionConfiguration() {
    CollectionConfiguration colCfg = new CollectionConfiguration();
    colCfg.setCollocated(true);
    colCfg.setCacheMode(REPLICATED);
    colCfg.setAtomicityMode(TRANSACTIONAL);

    return colCfg;
}

Recently, we have started receiving the NPE below:

2019-08-09 18:18:39,241 ERROR [Inbound-Main-Pool-13] [TransactionId:
e5b5bfe3-5246-4d54-a4d6-acd550240e13 Request ID - 27845] [ APP=Server,
ACTION=APP_PROCESS, USER=tsgops ] ProcessWorkflowProcessor - Error while
processing CLIENT process
class org.apache.ignite.IgniteException: Failed to serialize object
[typeName=LinkedList]
   at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:990)
   at org.apache.ignite.internal.processors.datastructures.GridCacheQueueAdapter$QueueIterator.remove(GridCacheQueueAdapter.java:687)
   at java.util.AbstractCollection.removeAll(AbstractCollection.java:376)
   at org.apache.ignite.internal.processors.datastructures.GridCacheQueueProxy.removeAll(GridCacheQueueProxy.java:180)
   at com.me.app.service.support.APPOrderProcessIgniteQueueService.removeAll(APPOrderProcessIgniteQueueService.java:63)
   at com.me.app.service.support.APPOrderContextProcessInputManager.removeAllFromCurrentProcessing(APPOrderContextProcessInputManager.java:201)
   at com.me.app.service.support.APPOrderContextProcessInputManager.lambda$removeAll$3(APPOrderContextProcessInputManager.java:100)
   at java.lang.Iterable.forEach(Iterable.java:75)
   at com.me.app.service.support.APPOrderContextProcessInputManager.removeAll(APPOrderContextProcessInputManager.java:100)
   at com.me.app.service.support.APPOrderContextProcessInputManager.removeAll(APPOrderContextProcessInputManager.java:90)
   at com.me.app.processor.support.ProcessWorkflowProcessor.processOrders(ProcessWorkflowProcessor.java:602)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$null$13(ProcessWorkflowProcessor.java:405)
   at java.util.HashMap.forEach(HashMap.java:1289)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$null$14(ProcessWorkflowProcessor.java:368)
   at java.util.HashMap.forEach(HashMap.java:1289)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$null$15(ProcessWorkflowProcessor.java:354)
   at java.util.HashMap.forEach(HashMap.java:1289)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$null$16(ProcessWorkflowProcessor.java:345)
   at java.util.HashMap.forEach(HashMap.java:1289)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$executeProcess$17(ProcessWorkflowProcessor.java:337)
   at java.util.HashMap.forEach(HashMap.java:1289)
   at com.me.app.processor.support.ProcessWorkflowProcessor.executeProcess(ProcessWorkflowProcessor.java:330)
   at com.me.app.processor.support.ProcessWorkflowProcessor.executeProcess(ProcessWorkflowProcessor.java:302)
   at com.me.app.processor.support.ProcessWorkflowProcessor.lambda$processProcessFromQueue$6(ProcessWorkflowProcessor.java:282)
   at com.me.app.locking.support.IgniteLockingService.execute(IgniteLockingService.java:39)
   at com.me.app.locking.support.IgniteLockingService.execute(IgniteLockingService.java:68)
   at com.me.app.processor.support.ProcessWorkflowProcessor.processProcessFromQueue(ProcessWorkflowProcessor.java:281)
   at com.me.app.facade.listener.support.APPProcessEventListener.listen(APPProcessEventListener.java:49)
   at com.me.app.facade.listener.support.APPProcessEventListener.listen(APPProcessEventListener.java:19)
   at com.me.app.common.listener.support.AbstractEventListener.onMessage(AbstractEventListener.java:44)
   at com.me.app.common.listener.support.AbstractEventListener$$FastClassBySpringCGLIB$$f1379f74.invoke()
   at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
   at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721)
   at