Hi Ilya,

Please find attached the jstack dump for the 3 nodes.


Yesterday, I modified the Java configuration of the servers to pass the 
-Xms512m -Xmx8g arguments to the JVM.


The cluster has been running fine since 18:30 CET yesterday evening. The 
"Failed to execute local query." error started to appear at 08:00 CET this 
morning, after an update of the @c0 items.

Several similar updates were made yesterday at 19:00, 20:00 and 22:00 CET 
without trouble.


Regards,

Stephane Gayet


________________________________
From: Ilya Kasnacheev <[email protected]>
Sent: Thursday, 31 May 2018 14:02:35
To: [email protected]
Subject: Re: SQL Query error

Hello!

It looks like your cluster was in a hung state (unable to perform partition 
map exchange when a client node exits or enters the topology) due to a stuck 
cache operation.

As witnessed by:
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], node=646cb075-dd64-4a90-a5a8-23f3b97f4d36]
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture - Partition release future: PartitionReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[GridDhtAtomicUpdateFuture [updateCntr=7, super=GridDhtAtomicAbstractUpdateFuture [futId=85, resCnt=0, addedReader=false, dhtRes={}]]]], DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]]]]
...
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - 
Pending atomic cache futures:
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - >>> 
GridDhtAtomicUpdateFuture [updateCntr=7, 
super=GridDhtAtomicAbstractUpdateFuture [futId=85, resCnt=0, addedReader=false, 
dhtRes={}]]

Once you killed one of the nodes, the operation was no longer considered 
stuck and your cluster recovered from the hang.
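
To spot this pattern in a log quickly, a minimal Python sketch along the following lines could list the futures reported after ">>>" in the diagnostic output. The function name pending_futures and the regular expression are illustrative assumptions based only on the log excerpt above; they are not part of Ignite, and other Ignite versions may format these entries differently.

```python
import re

# Matches diagnostic entries such as:
#   ">>> GridDhtAtomicUpdateFuture [updateCntr=7, super=... [futId=85, ...]]"
# The pattern is derived from the excerpt in this thread only (assumption).
PENDING_FUTURE = re.compile(
    r">>>\s+(?P<type>\w+Future)\s+\[.*?futId=(?P<fut_id>\d+)"
)


def pending_futures(log_text):
    """Return (future type, futId) pairs found in diagnostic log output."""
    # Mail clients and appenders may wrap long entries, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (match.group("type"), int(match.group("fut_id")))
        for match in PENDING_FUTURE.finditer(flat)
    ]
```

Running it over the logs of every node and comparing the reported futId values between two points in time may help confirm that the same operation stays pending.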

It's hard to say why this operation failed to complete. I suggest you 
collect Java thread dumps (with jstack) from all nodes the next time you 
notice that the cluster is stuck.
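
As a rough illustration, a small Python helper like the one below could be run on each node to save a dump to a timestamped file. The helper names, the file naming scheme, and the node labels are assumptions made for this sketch; it also assumes that jstack from the same JDK is on the PATH and that the process ID of the Ignite process is already known.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def dump_file_name(node_name, pid, when):
    """Build a file name such as 'jstack-node1-1234-20180601-080000.txt'."""
    return f"jstack-{node_name}-{pid}-{when:%Y%m%d-%H%M%S}.txt"


def collect_thread_dump(node_name, pid, out_dir="."):
    """Run `jstack -l <pid>` on the local machine and save its output."""
    target = Path(out_dir) / dump_file_name(node_name, pid, datetime.now())
    # -l asks jstack for the long listing, which includes lock information.
    result = subprocess.run(
        ["jstack", "-l", str(pid)],
        capture_output=True,
        text=True,
        check=True,  # raise if jstack cannot attach to the process
    )
    target.write_text(result.stdout)
    return target
```

Taking two or three dumps a few seconds apart on each node makes it easier to tell a thread that is genuinely stuck from one that is merely busy.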

Regards,



--
Ilya Kasnacheev

2018-05-31 14:01 GMT+03:00 Stéphane Gayet <[email protected]>:

Hi Ilya,


Thanks for your help.


Can you access the files at 
https://drive.google.com/drive/folders/1apPraUn-Z2wKXFr5Wsdf8qHIQssW0XgC?usp=sharing


Regards,

________________________________
From: Ilya Kasnacheev <[email protected]>
Sent: Thursday, 31 May 2018 12:32:35
To: [email protected]
Subject: Re: SQL Query error

Hello!

Full Ignite logs from the problematic node would be helpful. Can you upload 
the log file somewhere?

Regards,

--
Ilya Kasnacheev

2018-05-31 12:37 GMT+03:00 Stéphane Gayet <[email protected]>:

Hi All,


A correction to my previous email.


When the cluster stops responding to the SQL query, I can identify the faulty 
node from the following exception:

2018-05-31 11:20:17,036 [75] ERROR ServiceCache - Failed to get RtProposal
Apache.Ignite.Core.Cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals] ---> Apache.Ignite.Core.Common.JavaException: javax.cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:747)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1339)
at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processInLongOutLong(PlatformAbstractQueryCursor.java:147)
at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inLongOutLong(PlatformTargetProxyImpl.java:55)
Caused by: javax.cache.CacheException: Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.fail(GridReduceQueryExecutor.java:275)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onFail(GridReduceQueryExecutor.java:265)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onMessage(GridReduceQueryExecutor.java:244)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$2.onMessage(GridReduceQueryExecutor.java:188)
at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2332)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)


   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.ExceptionCheck()
   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.CallLongMethod(GlobalRef obj, IntPtr methodId, Int64* argsPtr)
   at Apache.Ignite.Core.Impl.Unmanaged.UnmanagedUtils.TargetInLongOutLong(GlobalRef target, Int32 opType, Int64 memPtr)
   at Apache.Ignite.Core.Impl.PlatformJniTarget.InLongOutLong(Int32 type, Int64 val)
   --- End of inner exception stack trace ---
   at Apache.Ignite.Core.Impl.PlatformJniTarget.InLongOutLong(Int32 type, Int64 val)
   at Apache.Ignite.Core.Impl.Cache.Query.QueryCursorBase`1.GetEnumerator()
   at MrFly.CacheDlm.Common.Services.ServiceCache.BuildRtProposalFromOwProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 1015
   at MrFly.CacheDlm.Common.Services.ServiceCache.BuildRtProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 913
   at MrFly.CacheDlm.Common.Services.ServiceCache.GetRtProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf, Boolean withCache) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 646

If I stop Ignite on this node, the cluster starts responding again.
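
For reference, the node ID can be pulled out of such an exception text with a short Python sketch like the one below. The function name failing_nodes is hypothetical, and the pattern relies only on the message wording shown in the exception above.

```python
import re

# Node IDs appear after this fixed phrase in the exception text above.
NODE_ID = re.compile(
    r"Failed to execute map query on the node:\s*"
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def failing_nodes(text):
    """Return the distinct node IDs named in map-query failures, in order."""
    # Collapse line wraps so an ID split from its message is still found.
    flat = " ".join(text.split())
    seen = []
    for node_id in NODE_ID.findall(flat):
        if node_id not in seen:
            seen.append(node_id)
    return seen
```

Applied to the exception above, it should return a single ID, since the same node is repeated in every nested message.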


So my questions are: why does the node stop responding, and how can I 
identify the root cause?


Regards,

Stephane Gayet


________________________________
From: Stéphane Gayet <[email protected]>
Sent: Wednesday, 30 May 2018 23:44
To: [email protected]
Subject: SQL Query error


Hi all,


We have installed an Ignite 2.4 cluster:

- 3 nodes running Windows (24 GB, 16 GB, 16 GB)

- 4 caches configured, partitioned, no backups

- no persistence


Cache @c0 contains around 60,000 items.

Cache @c3 contains only a few items (around 200), but the items are very large.


We run SQL queries which aggregate the @c0 items into large collections (up to 
14,000 items per collection) and store the result in @c3.


After a while, the SQL queries stop working. The following error is logged:


2018-05-30 23:14:42,716 [282] ERROR org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor - Failed to execute local query.

Here is our cache configuration:
    <cacheConfiguration>
      <cacheConfiguration name="owproposals" cacheMode="Partitioned" 
backups="0" readThrough="true" writeThrough="true" writeBehindEnabled="true">
        <cacheStoreFactory type="OwProposalFactory"/>
        <queryEntities>
          <queryEntity valueType="OwProposal, Cache.Common"
                       valueTypeName="Models.OwProposal"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="rtproposals" cacheMode="Partitioned" 
backups="0" readThrough="true" writeThrough="true" writeBehindEnabled="true">
        <cacheStoreFactory type="RtProposalFactory"/>
        <queryEntities>
          <queryEntity valueType="RtProposal, Cache.Common"
                       valueTypeName="Models.RtProposal"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="owcollection" cacheMode="Partitioned" 
backups="0">
        <queryEntities>
          <queryEntity valueType="OwCollection, Cache.Common"
                       valueTypeName="Models.OwCollection"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="rtcollection" cacheMode="Partitioned" 
backups="0">
        <queryEntities>
          <queryEntity valueType="RtCollection, Cache.Common"
                       valueTypeName="Models.RtCollection"/>
        </queryEntities>
      </cacheConfiguration>
    </cacheConfiguration>

I tried to clear the items from the @c0 cache before repopulating it, but I 
got the following error:

2018-05-30 23:34:34,333 [16] ERROR ServiceCache - Failed to delete OwItems older than 2018-05-31
Apache.Ignite.Core.Common.IgniteException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals] ---> Apache.Ignite.Core.Common.JavaException: class org.apache.ignite.IgniteCheckedException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
at org.apache.ignite.internal.processors.platform.utils.PlatformUtils.unwrapQueryException(PlatformUtils.java:519)
at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processOutStream(PlatformAbstractQueryCursor.java:132)
at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.outStream(PlatformTargetProxyImpl.java:93)
Caused by: javax.cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:747)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1339)
at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$9.iterator(IgniteH2Indexing.java:1403)
at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
at org.apache.ignite.internal.processors.cache.QueryCursorImpl.getAll(QueryCursorImpl.java:127)
at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processOutStream(PlatformAbstractQueryCursor.java:127)
... 1 more
Caused by: javax.cache.CacheException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.fail(GridReduceQueryExecutor.java:275)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onFail(GridReduceQueryExecutor.java:265)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onMessage(GridReduceQueryExecutor.java:244)
at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$2.onMessage(GridReduceQueryExecutor.java:188)
at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2332)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.ExceptionCheck()
   at Apache.Ignite.Core.Impl.Unmanaged.UnmanagedUtils.TargetOutStream(GlobalRef target, Int32 opType, Int64 memPtr)
   at Apache.Ignite.Core.Impl.PlatformJniTarget.OutStream[T](Int32 type, Func`2 readAction)
   --- End of inner exception stack trace ---
   at Apache.Ignite.Core.Impl.PlatformJniTarget.OutStream[T](Int32 type, Func`2 readAction)
   at Apache.Ignite.Core.Impl.Cache.Query.QueryCursorBase`1.GetAll()
   at MrFly.CacheDlm.Common.Services.ServiceCache.GetKeys[TV](QueryBase query)
   at MrFly.CacheDlm.Common.Services.ServiceCache.DeleteOneWayItems(DateTime createdDate)

At this point, the only way to recover is to shut down the three nodes and 
restart them from scratch.

Any idea what is malfunctioning or misconfigured?

Kind regards,

S Gayet






<<attachment: jstack.zip>>
