[jira] [Created] (IGNITE-12842) Cache the "IGNITE_DAEMON" system property in isDaemon call
Sunny Chan created IGNITE-12842:
---

Summary: Cache the "IGNITE_DAEMON" system property in isDaemon call
Key: IGNITE-12842
URL: https://issues.apache.org/jira/browse/IGNITE-12842
Project: Ignite
Issue Type: Improvement
Components: cache
Affects Versions: 2.7.6
Reporter: Sunny Chan

In a performance tuning exercise using JFR, we observed that when accessing the cache, Ignite repeatedly calls System.getProperty in GridKernalContextImpl.isDaemon():

!isDaemon.jpg!

This profile was generated with an earlier version of Ignite, but when I examined the latest code it behaves the same way.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
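A minimal sketch of the proposed improvement: read the property once and serve isDaemon() from a cached field. The class and method names here are illustrative, not Ignite's actual GridKernalContextImpl code, and a real fix would also have to account for the daemon flag being set programmatically.

```java
public class DaemonFlagCache {
    // Hypothetical cache: read the "IGNITE_DAEMON" system property a single
    // time at class initialization instead of on every isDaemon() call.
    private static final boolean DAEMON =
        Boolean.parseBoolean(System.getProperty("IGNITE_DAEMON", "false"));

    public static boolean isDaemon() {
        return DAEMON; // no System.getProperty call on the hot path
    }
}
```

On Java 8, System.getProperty goes through a synchronized Hashtable-backed Properties object, so taking it off a hot cache-access path removes both the lookup cost and potential lock contention that the JFR profile highlighted.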
[jira] [Created] (IGNITE-12719) Allow users to configure Ignite thread pools' core thread count, max thread count, etc.
Sunny Chan created IGNITE-12719:
---

Summary: Allow users to configure Ignite thread pools' core thread count, max thread count, etc.
Key: IGNITE-12719
URL: https://issues.apache.org/jira/browse/IGNITE-12719
Project: Ignite
Issue Type: Bug
Environment: We are running an Ignite cluster on bare metal on relatively high core count machines (4x10 cores, 20 threads), and looking at some of the thread pool initialization code in IgnitionEx.java:

{noformat}
sysExecSvc = new IgniteThreadPoolExecutor(
    "sys",
    cfg.getIgniteInstanceName(),
    cfg.getSystemThreadPoolSize(),
    cfg.getSystemThreadPoolSize(),
    DFLT_THREAD_KEEP_ALIVE_TIME,
    new LinkedBlockingQueue<>(),
    GridIoPolicy.SYSTEM_POOL);
{noformat}

Notice that the core thread pool size is equal to the max thread pool size, which by default is the same as the number of CPU cores. In our case, we won't be reusing any threads until we have enough requests coming in to fill 80 threads. We might also want to tune the thread keep-alive time to improve thread reuse. We propose changing Ignite so that users can configure the core thread pool size in these Ignite thread pools.
Reporter: Sunny Chan
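One wrinkle worth noting: a ThreadPoolExecutor fed by an unbounded queue never grows beyond its core size, which is likely why Ignite pins core to max. A hedged sketch of one way to keep the pool small under light load anyway, by letting idle core threads time out; the class and parameters are illustrative, not Ignite's actual configuration API:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunablePool {
    // With an unbounded LinkedBlockingQueue, threads beyond corePoolSize are
    // never created, so core must stay equal to max; instead, allow idle
    // core threads to exit after the (tunable) keep-alive interval.
    public static ThreadPoolExecutor create(int poolSize, long keepAliveSec) {
        ThreadPoolExecutor exec = new ThreadPoolExecutor(
            poolSize, poolSize,
            keepAliveSec, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>());
        exec.allowCoreThreadTimeOut(true); // idle threads exit after keepAliveSec
        return exec;
    }
}
```

This keeps full parallelism available under load while avoiding 80 permanently idle threads on a lightly loaded node; exposing both the core size and the keep-alive as configuration, as the issue proposes, would make the trade-off user-tunable.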
[jira] [Created] (IGNITE-12120) Change log level in GridCacheWriteBehindStore
Sunny Chan created IGNITE-12120:
---

Summary: Change log level in GridCacheWriteBehindStore
Key: IGNITE-12120
URL: https://issues.apache.org/jira/browse/IGNITE-12120
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.7.5
Reporter: Sunny Chan

In [GridCacheWriteBehindStore|https://github.com/apache/ignite/blob/7e73098d4d6e3d5f78326cb11dac7e083a2312dd/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/store/GridCacheWriteBehindStore.java#L893], when updateStore fails to write to the underlying store, it logs this as an error:

{{LT.error(log, e, "Unable to update underlying store: " + store);}}

After logging the error, it returns false so that the store update is retried. Later in the updateStore function, when the writeCache overflows, it logs:

{{log.warning("Failed to update store (value will be lost as current buffer size is greater " + …}}

and then removes the failed entry. I think the severities of these log messages are reversed: the failed update will still be retried, while the overflow actually loses the value. So I propose swapping the log severity levels so that the first message becomes a warning and the second an error.
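The proposed severity mapping boils down to one rule, sketched here with java.util.logging levels standing in for Ignite's logger; the helper name is hypothetical:

```java
import java.util.logging.Level;

public class WriteBehindSeverity {
    // Proposed rule: a store failure that will be retried is only a warning;
    // dropping an entry on buffer overflow loses data and should be an error.
    public static Level severityFor(boolean willBeRetried) {
        return willBeRetried ? Level.WARNING : Level.SEVERE;
    }
}
```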
[jira] [Created] (IGNITE-7382) Unknown Pair exception if work directory is removed after cluster shutdown
Sunny Chan created IGNITE-7382:
--

Summary: Unknown Pair exception if work directory is removed after cluster shutdown
Key: IGNITE-7382
URL: https://issues.apache.org/jira/browse/IGNITE-7382
Project: Ignite
Issue Type: Bug
Affects Versions: 2.3
Environment: Redhat Linux 7.4
Reporter: Sunny Chan

To reproduce, try this:
1) Start a server node.
2) Start a node in client mode and connect to the server node.
3) Shut down the server node and _leave the client node running_.
4) Remove the Ignite work directory for the shut-down server node.
5) Restart the server node; the client node reconnects automatically and sends a service call request.
6) An exception occurs on the server: it is unable to deserialize org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.

{noformat}
2018-01-05 00:00:26.027 processors.job.GridJobWorker [svc-#273%clog%] ERROR - Failed to initialize job [jobId=839fef0c061-ad77f485-d677-4694-8d0c-e780f16be9b7, ses=GridJobSessionImpl [ses=GridTaskSessionImpl [taskName=o.a.i.i.processors.service.GridServiceProxy$ServiceProxyCallable, dep=LocalDeployment [super=GridDeployment [ts=1515076515280, depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, clsLdrId=3e4f891c061-5235b03b-87c3-4c29-a576-1b44936b0c11, userVer=0, loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, undeployed=false, usage=0]], taskClsName=o.a.i.i.processors.service.GridServiceProxy$ServiceProxyCallable, sesId=739fef0c061-ad77f485-d677-4694-8d0c-e780f16be9b7, startTime=1515078026015, endTime=1515078036021, taskNodeId=ad77f485-d677-4694-8d0c-e780f16be9b7, clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=false, subjId=ad77f485-d677-4694-8d0c-e780f16be9b7, mapFut=IgniteFuture [orig=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=360535376]], execName=null], jobId=839fef0c061-ad77f485-d677-4694-8d0c-e780f16be9b7]]
org.apache.ignite.IgniteCheckedException: Failed to deserialize object [typeName=org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2]
	at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9859) [logic.jar:1.1.0]
	at org.apache.ignite.internal.processors.job.GridJobWorker.initialize(GridJobWorker.java:438) [logic.jar:1.1.0]
	at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1109) [logic.jar:1.1.0]
	at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1913) [logic.jar:1.1.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555) [logic.jar:1.1.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183) [logic.jar:1.1.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126) [logic.jar:1.1.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090) [logic.jar:1.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: org.apache.ignite.binary.BinaryObjectException: Failed to deserialize object [typeName=org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2]
	at org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1762) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1714) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:310) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:99) ~[logic.jar:1.1.0]
	at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:82) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9853) [logic.jar:1.1.0]
	... 10 more
Caused by: org.apache.ignite.binary.BinaryObjectException: Failed to unmarshal object with optimized marshaller
	at org.apache.ignite.internal.binary.BinaryUtils.doReadOptimized(BinaryUtils.java:1786) ~[logic.jar:1.1.0]
	at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1962) ~[logic.jar:1.1.0]
	at
{noformat}
[jira] [Created] (IGNITE-7083) Reduce memory usage of CachePartitionFullCountersMap
Sunny Chan created IGNITE-7083:
--

Summary: Reduce memory usage of CachePartitionFullCountersMap
Key: IGNITE-7083
URL: https://issues.apache.org/jira/browse/IGNITE-7083
Project: Ignite
Issue Type: Improvement
Components: cache
Affects Versions: 2.3
Environment: Any
Reporter: Sunny Chan

The cache partition exchange manager keeps copies of already completed exchanges, and we have found that they use a significant amount of memory. Further investigation with a heap dump showed that a large amount of this memory is used by CachePartitionFullCountersMap, and that in most cases these maps contain only zeros. I therefore propose an optimization: initially the long arrays storing the initial update counter and the update counter in CachePartitionFullCountersMap are null; a get that sees a null array returns 0 for the counter, and the arrays are allocated only on the first non-zero update to the map. In our tests, the heap used by GridCachePartitionExchangeManager was around 70MB (67 copies of these maps); after applying the optimization it drops to around 9MB.
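A minimal sketch of the lazy-allocation idea, using illustrative names rather than the actual CachePartitionFullCountersMap fields:

```java
public class LazyCountersMap {
    private final int parts;
    private long[] updCntrs; // null means "every counter is zero"

    public LazyCountersMap(int parts) {
        this.parts = parts;
    }

    public long updateCounter(int part) {
        // A null array stands for an all-zero map, so completed exchanges
        // whose counters were never touched cost no array storage at all.
        return updCntrs == null ? 0 : updCntrs[part];
    }

    public void setUpdateCounter(int part, long val) {
        if (updCntrs == null) {
            if (val == 0)
                return; // still all zeros: keep the array unallocated
            updCntrs = new long[parts]; // allocate on first non-zero update
        }
        updCntrs[part] = val;
    }
}
```

Assuming, say, 1024 partitions per cache group, each map carries two long[1024] arrays (about 16KB); skipping that allocation for all-zero maps across dozens of retained exchanges is consistent with the reported drop from roughly 70MB to 9MB.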
[jira] [Created] (IGNITE-6252) Cassandra Cache Store Session does not retry if a prepared statement fails
Sunny Chan created IGNITE-6252:
--

Summary: Cassandra Cache Store Session does not retry if a prepared statement fails
Key: IGNITE-6252
URL: https://issues.apache.org/jira/browse/IGNITE-6252
Project: Ignite
Issue Type: Bug
Components: cassandra
Affects Versions: 2.1, 2.0
Reporter: Sunny Chan

During our testing, we found a certain warning about prepared statements:

{noformat}
2017-08-31 11:27:19.479 org.apache.ignite.cache.store.cassandra.CassandraCacheStore flusher-0-#265%% WARN CassandraCacheStore - Prepared statement cluster error detected, refreshing Cassandra session
com.datastax.driver.core.exceptions.InvalidQueryException: Tried to execute unknown prepared query : 0xc7647611fd755386ef63478ee7de577b. You may have used a PreparedStatement that was created with another Cluster instance.
{noformat}

We noticed that after this warning occurs, some of the data is not persisted properly in the Cassandra cache. After further examining Ignite's CassandraSessionImpl code in the method execute(BatchExecutionAssistant, Iterable), we found that at around [line 283|https://github.com/apache/ignite/blob/86bd544a557663bce497134f7826be6b24d53330/modules/cassandra/store/src/main/java/org/apache/ignite/cache/store/cassandra/session/CassandraSessionImpl.java#L283], if the prepared statement fails in the async call, the operation is not retried: the error is stored at [line 269|https://github.com/apache/ignite/blob/86bd544a557663bce497134f7826be6b24d53330/modules/cassandra/store/src/main/java/org/apache/ignite/cache/store/cassandra/session/CassandraSessionImpl.java#L269] and cleared at [line 277|https://github.com/apache/ignite/blob/86bd544a557663bce497134f7826be6b24d53330/modules/cassandra/store/src/main/java/org/apache/ignite/cache/store/cassandra/session/CassandraSessionImpl.java#L277], but it is not checked again after going through the [ResultSetFuture|https://github.com/apache/ignite/blob/86bd544a557663bce497134f7826be6b24d53330/modules/cassandra/store/src/main/java/org/apache/ignite/cache/store/cassandra/session/CassandraSessionImpl.java#L307]. I believe that at line 307 the code should check for error != null so that any failure is retried. Line 312 also potentially needs to check isTableAbsenceError(error).
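The control-flow problem can be sketched independently of the Cassandra driver. In this hedged model (the names are illustrative, not CassandraSessionImpl's actual code), async failures are recorded in an error slot rather than thrown, so the loop must explicitly re-check the slot before treating the attempt as a success; this re-check is the one the issue says is missing around line 307:

```java
import java.util.concurrent.atomic.AtomicReference;

public class AsyncRetrySketch {
    /** A batch that reports failures by recording them instead of throwing. */
    public interface Batch {
        void run(AtomicReference<Exception> error);
    }

    /** Returns the number of attempts taken before the batch completed cleanly. */
    public static int executeWithRetry(Batch batch, int maxAttempts) throws Exception {
        AtomicReference<Exception> error = new AtomicReference<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            error.set(null);         // error cleared before each attempt (cf. line 277)
            batch.run(error);        // async failures land in 'error' (cf. line 269)
            if (error.get() == null) // the re-check: only now is this a success
                return attempt;
        }
        throw error.get();
    }
}
```

Without the error.get() == null check, the loop would declare success after the first attempt even though the batch recorded a failure, which matches the observed symptom of writes silently never reaching Cassandra.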