There is a definite slowdown (from 160k/sec to 8k/sec with 10 concurrent clients), along with plenty of messages like these:

[07:08:16] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_CRITICAL_OPERATION_TIMEOUT, err=class o.a.i.IgniteException: Checkpoint read lock acquisition has been timed out.]]
[07:08:16,915][SEVERE][client-connector-#85][GridCacheDatabaseSharedManager] Checkpoint read lock acquisition has been timed out.
class org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.failCheckpointReadLock(GridCacheDatabaseSharedManager.java:1728)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1654)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1819)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1734)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:300)
    ....
[09:04:25,831][SEVERE][tcp-disco-msg-worker-[crd]-#2-#56][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [workerName=db-checkpoint-thread, threadName=db-checkpoint-thread-#79, blockedFor=13s]
[09:04:25] Possible failure suppressed accordingly to a configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=db-checkpoint-thread, igniteInstanceName=null, finished=false, heartbeatTs=1615824252073]]]

On Mon, Mar 15, 2021 at 2:44 AM Stephen Darlington <[email protected]> wrote:

> What happens after you have 150mm records? You say it’s not okay, but you
> don’t say what does happen. Does it crash, slow down, what?
>
> With SQL and persistence, I think you’d probably need more heap space.
> Increasing the number of partitions isn’t likely to help.
>
> On 14 Mar 2021, at 20:14, Sebastien Blind <[email protected]>
> wrote:
>
> Hello Apache Ignite friends,
> I am trying to load 1bn records in an ignite cache with persistence
> enabled for a poc. Records only consist of an id, a uuid and a state, and
> I'd like to lookup by id (this is the key in the cache) and by token (sql
> queries for that are fine or another cache keyed by uuid is another option).
>
> The experiment is running locally on a MacOs (16Gb memory) laptop and I'd
> like to use as little memory as possible, and use disk storage as much as
> possible. Everything works ok until ~150M with query index enabled, and
> ~600M without index or side cache, so I'd be curious about what type of
> settings (if any) could help w/ the setup, and what amount of data can a
> single node be expected to handle. Restarting my ignite node and the
> loading process seems to help and more data can be stuffed in the cache.
>
> I've tried to play w/ settings like size, number of partitions etc and to
> find some info on how to control what's onheap/offheap etc, any
> overhead/sizing that could be taken into consideration without much
> success. Maybe the experiment is doomed to fail just based on the specs,
> but it would be good to understand the constraints.. so any help would be
> much appreciated!
>
> Thanks in advance!
> Sebastien
>
> PS:
> JVM is running w/ -Xms4g -Xmx4g -server -XX:MaxMetaspaceSize=256m settings.
>
> Some snippet of the config:
> <property name="persistenceEnabled" value="true" />
> <property name="initialSize" value="#{4L * 1024 * 1024 * 1024}" />
> <property name="maxSize" value="#{4L * 1024 * 1024 * 1024}" />
> <property name="pageEvictionMode" value="RANDOM_2_LRU" />
>
> Cache configuration
> <property name="cacheConfiguration">
>   <bean class="org.apache.ignite.configuration.CacheConfiguration">
>     <!-- Set the cache name. -->
>     <property name="name" value="qa_sqr_txn" />
>     <!-- Set the cache mode. -->
>     <property name="cacheMode" value="PARTITIONED" />
>     <property name="backups" value="0" />
>     <property name="storeKeepBinary" value="true" />
>
>     <property name="affinity">
>       <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
>         <property name="partitions" value="8192" />
>       </bean>
>     </property>
>
>     <!-- Configure query entities -->
>     <property name="queryEntities">
>       <list>
>         <bean class="org.apache.ignite.cache.QueryEntity">
>           <!-- Setting the type of the key -->
>           <property name="keyType" value="java.lang.String" />
>           <property name="keyFieldName" value="id" />
>
>           <!-- Setting type of the value -->
>           <property name="valueType" value="com.xxx.Txn" />
>
>           <property name="fields">
>             <map>
>               <entry key="id" value="java.lang.String" />
>               <entry key="token" value="java.lang.String" />
>               <entry key="state" value="java.lang.String " />
>             </map>
>           </property>
>           <!--
>           <property name="indexes">
>             <list>
>               <bean class="org.apache.ignite.cache.QueryIndex">
>                 <constructor-arg value="token" />
>               </bean>
>             </list>
>           </property>
>           -->
>         </bean>
>       </list>
>     </property>
>   </bean>
> </property>
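
For a rough sense of scale, here is a back-of-envelope sketch (plain Java, no Ignite dependency) comparing an estimated on-disk data volume with the 4 GB data region from the config above. The record count and region size come from the thread; the per-field byte sizes and the per-entry overhead are purely hypothetical placeholders, not measured Ignite figures, so treat the output as an order-of-magnitude guess only.

```java
public class SizingSketch {
    // Figures taken from the thread.
    static final long RECORDS = 1_000_000_000L;                    // target of 1bn records
    static final long DATA_REGION_BYTES = 4L * 1024 * 1024 * 1024; // maxSize from the config

    // Hypothetical assumptions, NOT measured values: adjust to the real data.
    static final int ID_BYTES = 20;                  // assumed size of the String key
    static final int UUID_BYTES = 36;                // assumed textual UUID token
    static final int STATE_BYTES = 10;               // assumed size of the state string
    static final int PER_ENTRY_OVERHEAD_BYTES = 100; // assumed headers and bookkeeping

    // Estimated bytes per stored entry under the assumptions above.
    static long estimatedEntryBytes() {
        return ID_BYTES + UUID_BYTES + STATE_BYTES + PER_ENTRY_OVERHEAD_BYTES;
    }

    // Estimated total data volume for the given number of records.
    static long estimatedTotalBytes(long records) {
        return records * estimatedEntryBytes();
    }

    // Fraction of the estimated data that could sit in the data region at once (capped at 1.0).
    static double fractionInMemory(long records) {
        return Math.min(1.0, (double) DATA_REGION_BYTES / estimatedTotalBytes(records));
    }

    public static void main(String[] args) {
        // The two record counts where problems were reported, plus the 1bn target.
        long[] counts = {150_000_000L, 600_000_000L, RECORDS};
        for (long n : counts) {
            double gib = estimatedTotalBytes(n) / (1024.0 * 1024 * 1024);
            System.out.printf("%d records: ~%.1f GiB estimated, ~%.1f%% fits in the data region%n",
                    n, gib, 100.0 * fractionInMemory(n));
        }
    }
}
```

Under these assumed sizes only a small fraction of the data would fit in the region at the 1bn mark, which would mean most lookups and index updates go to disk. An SQL index on token would add its own pages on top of this estimate; the sketch does not attempt to model that.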
