[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420264#comment-15420264
 ] 

Jianfeng Jia commented on ASTERIXDB-1535:
-----------------------------------------

It happened again, but this time I checked the log and found some exceptions:
{code}
INFO: NO NEED TO NOTIFY JOB FINISH!
org.apache.hyracks.api.exceptions.HyracksException: Job failed on account of:
org.apache.hyracks.api.exceptions.HyracksDataException: 
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.ClosedChannelException

    at org.apache.hyracks.control.cc.job.JobRun.waitForCompletion(JobRun.java:212)
    at org.apache.hyracks.control.cc.work.WaitForJobCompletionWork$1.run(WaitForJobCompletionWork.java:48)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: 
org.apache.hyracks.api.exceptions.HyracksDataException: 
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.ClosedChannelException
    at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:45)
    at org.apache.hyracks.control.nc.Task.run(Task.java:319)
    ... 3 more
Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: 
org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.ClosedChannelException
    at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:365)
    at org.apache.hyracks.control.nc.Task.run(Task.java:297)
    ... 3 more
Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: 
java.nio.channels.ClosedChannelException
    at org.apache.hyracks.control.nc.io.IOManager.syncRead(IOManager.java:175)
    at org.apache.hyracks.storage.common.buffercache.BufferCache.read(BufferCache.java:575)
    at org.apache.hyracks.storage.common.buffercache.BufferCache.pin(BufferCache.java:211)
    at org.apache.hyracks.storage.am.common.freepage.LinkedMetaDataPageManager.getFirstMetadataPage(LinkedMetaDataPageManager.java:376)
    at org.apache.hyracks.storage.am.common.impls.AbstractTreeIndex.activate(AbstractTreeIndex.java:188)
    at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndexFileManager.isValidTreeIndex(AbstractLSMIndexFileManager.java:83)
    at org.apache.hyracks.storage.am.lsm.common.impls.AbstractLSMIndexFileManager.cleanupAndGetValidFilesInternal(AbstractLSMIndexFileManager.java:114)
    at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeFileManager.cleanupAndGetValidFiles(LSMBTreeFileManager.java:95)
    at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.activate(LSMBTree.java:180)
    at org.apache.asterix.common.context.DatasetLifecycleManager.open(DatasetLifecycleManager.java:209)
    at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.open(IndexDataflowHelper.java:116)
    at org.apache.asterix.runtime.operators.AsterixLSMPrimaryUpsertOperatorNodePushable.open(AsterixLSMPrimaryUpsertOperatorNodePushable.java:115)
    at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:341)
    ... 4 more
Caused by: java.nio.channels.ClosedChannelException
    at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:94)
    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:673)
    at org.apache.hyracks.control.nc.io.IOManager.syncRead(IOManager.java:163)
    ... 16 more

{code}

I don't think this is a CC problem; it should still be an NC problem, as in
ASTERIXDB-1534.
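
For reference, the innermost frame in the trace is FileChannelImpl.ensureOpen, which throws ClosedChannelException when a read is attempted on a channel that has already been closed. The following is only a minimal standalone sketch of that JDK behavior, not AsterixDB code: the class name ClosedChannelDemo, the method name, and the temp file are made up for illustration, and it makes no claim about what actually closed the channel in this issue.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Hyracks or AsterixDB.
public class ClosedChannelDemo {

    /**
     * Opens a channel on a temp file, closes it, then attempts a positional
     * read. Returns true if the read fails with ClosedChannelException.
     */
    static boolean readAfterCloseFails() throws IOException {
        Path tmp = Files.createTempFile("closed-channel-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ);
            ch.close(); // simulate some other code path closing the channel first
            try {
                ch.read(ByteBuffer.allocate(4), 0L); // positional read on a closed channel
                return false;
            } catch (ClosedChannelException e) {
                return true;
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read after close failed: " + readAfterCloseFails());
    }
}
```

Note that a FileChannel is also closed by the JDK when a thread blocked in an I/O operation on it is interrupted, after which any other thread using the same channel sees ClosedChannelException. Whether that is what happened on the NC here has not been established.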

> CC stop answering query from 19002 RESTAPI port
> -----------------------------------------------
>
>                 Key: ASTERIXDB-1535
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1535
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: HTTP API
>         Environment: master
> commit a89fae64ac21fb8eefde79f79d2dbe1a0e54c364
> Date:   Wed Jul 6 07:58:55 2016 -0700
>            Reporter: Jianfeng Jia
>            Assignee: Ian Maxon
>         Attachments: cc.jstack
>
>
> The 8888/adminconsole showed many pending jobs, while ingestion and
> queries worked fine on the NC.
> If this situation lasts long enough, say 2 days, the 19002 API stops
> responding to any queries, while the web interface on port 19001 can still
> answer queries.
> I had to restart the cluster to recover the service. Before that, I recorded
> the jstack log of the CC, as attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
