Hi Andrey,
I have attached the log.
Thanks.
On 24 February 2017 at 18:16, Andrey Mashenkov <[email protected]>
wrote:
> Hi Anil,
>
> Would you please provide the Ignite logs as well?
>
>
> On Fri, Feb 24, 2017 at 3:33 PM, Andrey Gura <[email protected]> wrote:
>
>> Hi, Anil
>>
>> Could you please provide the crash dump? In your case it is the
>> /opt/ignite-manager/api/hs_err_pid18543.log file.
>>
>> On Fri, Feb 24, 2017 at 9:05 AM, Anil <[email protected]> wrote:
>> > Hi ,
>> >
>> > I see the node go down with the following error while running a compute task:
>> >
>> >
>> > # A fatal error has been detected by the Java Runtime Environment:
>> > #
>> > # SIGSEGV (0xb) at pc=0x00007facd5cae561, pid=18543, tid=0x00007fab8a9ea700
>> > #
>> > # JRE version: OpenJDK Runtime Environment (8.0_111-b14) (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
>> > # Java VM: OpenJDK 64-Bit Server VM (25.111-b14 mixed mode linux-amd64 compressed oops)
>> > # Problematic frame:
>> > # J 8676 C2 org.apache.ignite.internal.processors.query.h2.opt.GridH2KeyValueRowOffheap.getOffheapValue(I)Lorg/h2/value/Value; (290 bytes) @ 0x00007facd5cae561 [0x00007facd5cae180+0x3e1]
>> > #
>> > # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> > #
>> > # An error report file with more information is saved as:
>> > # /opt/ignite-manager/api/hs_err_pid18543.log
>> > #
>> > # If you would like to submit a bug report, please visit:
>> > # http://bugreport.java.com/bugreport/crash.jsp
>> > #
>> >
>> >
>> > I have two caches on a 4-node cluster; each cache is configured with 10 GB of
>> > off-heap memory.
>> >
>> > The ComputeTask performs the following and is broadcast to all nodes:
>> >
>> > for (Integer part : parts) {
>> >     // Scan one local partition of the Person cache.
>> >     ScanQuery<String, Person> scanQuery = new ScanQuery<String, Person>();
>> >     scanQuery.setLocal(true);
>> >     scanQuery.setPartition(part);
>> >
>> >     Iterator<Cache.Entry<String, Person>> iterator =
>> >         cache.query(scanQuery).iterator();
>> >
>> >     while (iterator.hasNext()) {
>> >         Cache.Entry<String, Person> row = iterator.next();
>> >         String eqId = row.getValue().getEqId();
>> >         try {
>> >             // Local SQL lookup of the details that share this eqId.
>> >             QueryCursor<Entry<AffinityKey<String>, PersonDetail>> pdCursor =
>> >                 detailsCache.query(new SqlQuery<AffinityKey<String>, PersonDetail>(
>> >                     PersonDetail.class,
>> >                     "select * from DETAIL_CACHE.PersonDetail where eqId = ? order by enddate desc")
>> >                     .setLocal(true).setArgs(eqId));
>> >             Long prev = null;
>> >             for (Entry<AffinityKey<String>, PersonDetail> d : pdCursor) {
>> >                 // populate person info into person detail
>> >                 dataStreamer.addData(new AffinityKey<String>(detaildId, eqId), d);
>> >             }
>> >             pdCursor.close();
>> >         } catch (Exception ex) {
>> >             // exceptions are silently ignored here
>> >         }
>> >     }
>> > }
>> >
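A side note on the loop above: pdCursor.close() is only reached when the inner
loop finishes normally, and the empty catch block hides any failure. Whether
that has anything to do with the SIGSEGV is not established in this thread. The
following is only a minimal sketch of closing a cursor with try-with-resources
and recording the error instead of dropping it. FakeCursor, process and the
errors list are hypothetical stand-ins so that the example runs without Ignite;
Ignite's QueryCursor is AutoCloseable, so the same pattern should carry over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CursorCloseSketch {

    /** Hypothetical stand-in for a query cursor: iterable and closeable. */
    static final class FakeCursor implements Iterable<String>, AutoCloseable {
        private final List<String> rows;
        boolean closed;

        FakeCursor(List<String> rows) {
            this.rows = rows;
        }

        @Override
        public Iterator<String> iterator() {
            return rows.iterator();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /**
     * Iterates the cursor and returns the number of rows handled before any
     * failure. The cursor is closed on both the normal and the exceptional
     * path, and the failure message is recorded rather than swallowed.
     */
    static int process(FakeCursor cursor, List<String> errors) {
        int processed = 0;
        try (FakeCursor c = cursor) {
            for (String row : c) {
                // Assumption for the demo: an empty row simulates a failing step.
                if (row.isEmpty())
                    throw new IllegalStateException("empty row");
                processed++;
            }
        } catch (RuntimeException ex) {
            // The resource is already closed when this block runs.
            errors.add(ex.getMessage());
        }
        return processed;
    }

    public static void main(String[] args) {
        FakeCursor cursor = new FakeCursor(Arrays.asList("a", "", "b"));
        List<String> errors = new ArrayList<>();
        int processed = process(cursor, errors);
        System.out.println("processed=" + processed
            + " closed=" + cursor.closed + " errors=" + errors);
    }
}
```

In the real task the recorded error would more likely go to the application
logger, which at least shows up in the node log next to the partition timings.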
>> >
>> > Please let me know if you see any issues with the approach or the
>> > configuration.
>> >
>> > Thanks.
>> >
>>
>
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>
2017-02-23 19:34:52 610 INFO CacheUpdateService:262 -
********************before ignite started*************************************
2017-02-23 19:34:52 763 WARN NoopCheckpointSpi:480 - Checkpoints are disabled
(to enable configure any GridCheckpointSpi implementation)
2017-02-23 19:34:52 808 WARN GridCollisionManager:480 - Collision resolution
is disabled (all jobs will be activated upon arrival).
2017-02-23 19:34:53 153 WARN TcpDiscoverySpi:480 - Failure detection timeout
will be ignored (one of SPI parameters has been set explicitly)
2017-02-23 19:34:54 996 INFO CacheUpdateService:264 -
********************After ignite started********************
2017-02-23 19:34:55 220 WARN VerifiableProperties:83 - Property
enable.auto.commit is not valid
2017-02-23 19:34:55 260 INFO ZkEventThread:64 - Starting ZkClient event thread.
2017-02-23 19:34:55 820 INFO CacheUpdateService:107 - Kafka is connected
successfully
2017-02-23 19:34:56 795 DEBUG ApplicationLauncher:57 - Vertx started as a
cluster
2017-02-23 19:34:56 926 DEBUG ApplicationLauncher:70 - RestControllerVerticle
deployed successfully
2017-02-23 19:34:56 931 WARN RestControllerVerticle:47 -
RestControllerVerticle server listening to ...8098
2017-02-23 19:34:57 547 DEBUG ApplicationLauncher:92 -
CacheManagerWorkerVerticle deployed successfully
2017-02-23 20:58:56 428 WARN GridCachePartitionExchangeManager:480 - Found
long running transaction [startTime=20:57:01.468, curTime=20:58:56.408,
tx=GridDhtTxRemote [nearNodeId=70756eaa-28d2-442b-8ec7-f3dbcc7f4868,
rmtFutId=2cbc26e6a51-12cc4105-7f9b-4310-9bc9-d61f36e6f7a4,
nearXidVer=GridCacheVersion [topVer=99387503, time=1487912418499,
order=1487960256302, nodeOrder=3], super=GridDistributedTxRemoteAdapter
[explicitVers=null, started=true, commitAllowed=0,
txState=IgniteTxRemoteSingleStateImpl [entry=IgniteTxEntry
[key=KeyCacheObjectImpl [val=BinaryMetadataKey [typeId=-1063706775],
hasValBytes=true], cacheId=-2100569601, partId=-1, txKey=IgniteTxKey
[key=KeyCacheObjectImpl [val=BinaryMetadataKey [typeId=-1063706775],
hasValBytes=true], cacheId=-2100569601], val=[op=TRANSFORM, val=null],
prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null],
entryProcessorsCol=[IgniteBiTuple [val1=MetadataProcessor
[newMeta=BinaryMetadata [typeId=-1063706775,
typeName=net.juniper.cs.cache.loader.AddInstallBaseToContractCallable,
fields=null, affKeyFieldName=null, isEnum=false]],
val2=[Ljava.lang.Object;@1d78bc6b]], ttl=-1, conflictExpireTime=-1,
conflictVer=null, explicitVer=null, dhtVer=null, filters=[],
filtersPassed=false, filtersSet=false, entry=GridDhtColocatedCacheEntry
[super=GridDhtCacheEntry [rdrs=[], locPart=GridDhtLocalPartition [id=75,
map=o.a.i.i.processors.cache.GridCacheConcurrentMapImpl@2619095e,
rmvQueue=GridCircularBuffer [sizeMask=127, idxGen=0], cntr=0,
shouldBeRenting=false, state=OWNING, reservations=0, empty=false,
createTime=02/23/2017 19:34:54], super=GridDistributedCacheEntry
[super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=BinaryMetadataKey
[typeId=-1063706775], hasValBytes=true], val=null, startVer=1487960256288,
ver=GridCacheVersion [topVer=99387503, time=1487912418484, order=1487960256288,
nodeOrder=2], hash=-1063706775, extras=GridCacheMvccEntryExtras
[mvcc=GridCacheMvcc [locs=null, rmts=[GridCacheMvccCandidate
[nodeId=70756eaa-28d2-442b-8ec7-f3dbcc7f4868, ver=GridCacheVersion
[topVer=99387503, time=1487912418499, order=1487960256302, nodeOrder=3],
timeout=0, ts=1487912221468, threadId=4061, id=69,
topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], reentry=null,
otherNodeId=70756eaa-28d2-442b-8ec7-f3dbcc7f4868, otherVer=null,
mappedDhtNodes=null, mappedNearNodes=null, ownerVer=null, serOrder=null,
key=KeyCacheObjectImpl [val=BinaryMetadataKey [typeId=-1063706775],
hasValBytes=true],
masks=local=0|owner=0|ready=0|reentry=0|used=0|tx=1|single_implicit=0|dht_local=0|near_local=0|removed=0,
prevVer=null, nextVer=null]]]], flags=0]]]], prepared=1, locked=false,
nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0,
partUpdateCntr=0, serReadVer=null, xidVer=null]], super=IgniteTxAdapter
[xidVer=GridCacheVersion [topVer=99387503, time=1487912418499,
order=1487960256302, nodeOrder=3], writeVer=GridCacheVersion [topVer=99387503,
time=1487912418500, order=1487960256304, nodeOrder=3], implicit=false,
loc=false, threadId=4061, startTime=1487912221468,
nodeId=70756eaa-28d2-442b-8ec7-f3dbcc7f4868, startVer=GridCacheVersion
[topVer=99387503, time=1487912408693, order=1487960256287, nodeOrder=2],
endVer=null, isolation=READ_COMMITTED, concurrency=OPTIMISTIC, timeout=0,
sysInvalidate=false, sys=true, plc=5, commitVer=null, finalizing=NONE,
preparing=false, invalidParts=null, state=PREPARED, timedOut=false,
topVer=AffinityTopologyVersion [topVer=14, minorTopVer=1], duration=114940ms,
onePhaseCommit=false]]]]
2017-02-23 20:58:56 431 WARN GridCachePartitionExchangeManager:480 - Found
long running cache operations, dump IO statistics.
2017-02-23 20:58:56 434 WARN TcpCommunicationSpi:480 - Communication SPI
recovery descriptors:
[key=ClientKey [nodeId=7c4eed02-11b9-48ec-831a-e1211b58e040, order=4],
msgsSent=5700, msgsAckedByRmt=5700, msgsRcvd=5708, lastAcked=5708,
reserveCnt=1, descIdHash=240963331]
[key=ClientKey [nodeId=52d16cff-f715-4874-bc12-d512e44107cc, order=1],
msgsSent=5751, msgsAckedByRmt=5751, msgsRcvd=5764, lastAcked=5764,
reserveCnt=2, descIdHash=868249250]
[key=ClientKey [nodeId=70756eaa-28d2-442b-8ec7-f3dbcc7f4868, order=3],
msgsSent=4123, msgsAckedByRmt=4123, msgsRcvd=4121, lastAcked=4121,
reserveCnt=4, descIdHash=536777179]
[key=ClientKey [nodeId=cdc2c02b-ba68-46db-9e94-dd8b57f063b7, order=9],
msgsSent=37, msgsAckedByRmt=37, msgsRcvd=35, lastAcked=35, reserveCnt=1,
descIdHash=812257513]
[key=ClientKey [nodeId=e87a4f59-f0f9-43c8-97ef-26410b8d462d, order=12],
msgsSent=36, msgsAckedByRmt=36, msgsRcvd=34, lastAcked=34, reserveCnt=2,
descIdHash=1487336752]
Communication SPI clients:
[node=70756eaa-28d2-442b-8ec7-f3dbcc7f4868,
client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl
[selectorIdx=2, queueSize=0, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768
cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
recovery=GridNioRecoveryDescriptor [acked=4123, resendCnt=0, rcvCnt=4121,
sentCnt=4123, reserved=true, lastAck=4121, nodeLeft=false,
node=TcpDiscoveryNode [id=70756eaa-28d2-442b-8ec7-f3dbcc7f4868,
addrs=[0:0:0:0:0:0:0:1%lo, X.X.X.188, 127.0.0.1],
sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /X.X.X.188:47500],
discPort=47500, order=3, intOrder=3, lastExchangeTime=1487907293360, loc=false,
ver=1.8.0#20161205-sha1:9ca40dbe, isClient=false], connected=true,
connectCnt=2, queueLimit=5120, reserveCnt=4], super=GridNioSessionImpl
[locAddr=/X.X.X.187:40444, rmtAddr=/X.X.X.188:47100, createTime=1487911075531,
closeTime=0, bytesSent=391311567, bytesRcvd=94164, sndSchedTime=1487912330800,
lastSndTime=1487912236256, lastRcvTime=1487912330800, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=o.a.i.i.util.nio.GridDirectParser@7d28aa0c, directMode=true],
GridConnectionBytesVerifyFilter], accepted=false]],
super=GridAbstractCommunicationClient [lastUsed=1487911075531, reserves=0]]]
2017-02-23 20:58:56 434 WARN TcpCommunicationSpi:480 -
>> Selector info [idx=0, keysCnt=0]
2017-02-23 20:58:56 437 WARN TcpCommunicationSpi:480 -
>> Selector info [idx=1, keysCnt=0]
2017-02-23 20:58:56 440 WARN TcpCommunicationSpi:480 -
>> Selector info [idx=3, keysCnt=0]
2017-02-23 20:58:56 441 WARN TcpCommunicationSpi:480 -
>> Selector info [idx=2, keysCnt=1]
Connection info [rmtAddr=/X.X.X.188:47100, locAddr=/X.X.X.187:40444,
msgsSent=4123, msgsAckedByRmt=4123, msgsRcvd=4121, descIdHash=536777179,
bytesRcvd=94164, bytesSent=391311567, opQueueSize=0,
msgWriter=DirectMessageWriter [state=DirectMessageState [pos=0,
stack=[StateItem [stream=DirectByteBufferStreamImplV2
[buf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
baseOff=140365536188048, arrOff=-1, tmpArrOff=0, tmpArrBytes=0,
msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false,
readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0,
uuidLeast=0, uuidLocId=0, lastFinished=true], state=0, hdrWritten=false],
StateItem [stream=DirectByteBufferStreamImplV2
[buf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
baseOff=140365536188048, arrOff=-1, tmpArrOff=0, tmpArrBytes=0,
msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false,
readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0,
uuidLeast=0, uuidLocId=0, lastFinished=true], state=0, hdrWritten=false],
StateItem [stream=DirectByteBufferStreamImplV2
[buf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
baseOff=140365536188048, arrOff=-1, tmpArrOff=0, tmpArrBytes=0,
msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false,
readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0,
uuidLeast=0, uuidLocId=0, lastFinished=true], state=0, hdrWritten=false],
StateItem [stream=DirectByteBufferStreamImplV2
[buf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
baseOff=140365536188048, arrOff=-1, tmpArrOff=0, tmpArrBytes=0,
msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false,
readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0,
uuidLeast=0, uuidLocId=0, lastFinished=true], state=0, hdrWritten=false],
StateItem [stream=DirectByteBufferStreamImplV2
[buf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
baseOff=140365536188048, arrOff=-1, tmpArrOff=0, tmpArrBytes=0,
msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false,
readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0,
uuidLeast=0, uuidLocId=0, lastFinished=true], state=0, hdrWritten=false], null,
null, null, null, null]]], msgReader=DirectMessageReader
[state=DirectMessageState [pos=0, stack=[StateItem
[stream=DirectByteBufferStreamImplV2 [buf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], baseOff=140365536220832, arrOff=-1, tmpArrOff=0,
tmpArrBytes=0, msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1,
keyDone=false, readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0,
uuidMost=0, uuidLeast=0, uuidLocId=0, lastFinished=true], state=0], StateItem
[stream=DirectByteBufferStreamImplV2 [buf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], baseOff=140365536220832, arrOff=-1, tmpArrOff=0,
tmpArrBytes=0, msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1,
keyDone=false, readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0,
uuidMost=0, uuidLeast=0, uuidLocId=0, lastFinished=true], state=0], StateItem
[stream=DirectByteBufferStreamImplV2 [buf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], baseOff=140365536220832, arrOff=-1, tmpArrOff=0,
tmpArrBytes=0, msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1,
keyDone=false, readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0,
uuidMost=0, uuidLeast=0, uuidLocId=0, lastFinished=true], state=0], StateItem
[stream=DirectByteBufferStreamImplV2 [buf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], baseOff=140365536220832, arrOff=-1, tmpArrOff=0,
tmpArrBytes=0, msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1,
keyDone=false, readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0,
uuidMost=0, uuidLeast=0, uuidLocId=0, lastFinished=true], state=0], null, null,
null, null, null, null]], lastRead=true]]
2017-02-23 20:59:01 647 INFO AddDetailsCallable:60 - ************* Staring the
compute job *****************
2017-02-23 20:59:01 652 INFO AddDetailsCallable:86 - processing the partition
0
2017-02-23 20:59:07 929 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 0 is 6277
2017-02-23 20:59:07 929 INFO AddDetailsCallable:86 - processing the partition
2
2017-02-23 20:59:14 362 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 2 is 6433
2017-02-23 20:59:14 363 INFO AddDetailsCallable:86 - processing the partition
3
2017-02-23 20:59:21 353 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 3 is 6990
2017-02-23 20:59:21 354 INFO AddDetailsCallable:86 - processing the partition
11
2017-02-23 20:59:33 353 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 11 is 11999
2017-02-23 20:59:33 354 INFO AddDetailsCallable:86 - processing the partition
16
2017-02-23 20:59:46 951 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 16 is 13597
2017-02-23 20:59:46 952 INFO AddDetailsCallable:86 - processing the partition
17
2017-02-23 21:00:01 214 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 17 is 14261
2017-02-23 21:00:01 214 INFO AddDetailsCallable:86 - processing the partition
18
2017-02-23 21:00:15 514 INFO AddDetailsCallable:226 - **** Time taken to
process the partition 18 is 14300
2017-02-23 21:00:15 515 INFO AddDetailsCallable:86 - processing the partition
21
2017-02-23 21:00:21 008 WARN IgniteH2Indexing:480 - Query execution is too
long [time=3960 ms, sql='SELECT