Quanlong Huang created IMPALA-11234:
---------------------------------------

             Summary: impalad keeps reporting ShortCircuitCache slot release 
failures in heavy workload
                 Key: IMPALA-11234
                 URL: https://issues.apache.org/jira/browse/IMPALA-11234
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Quanlong Huang


I keep seeing this error during a local perf test on my desktop machine:
{code:java}
E0410 07:04:10.691095   430 ShortCircuitCache.java:232] ShortCircuitCache(0x6e76c6a7): failed to release short-circuit shared memory slot Slot(slotIdx=0, shm=DfsClientShm(1effcf56a590fbc371938a368987f4e9)) by sending ReleaseShortCircuitAccessRequestProto to /var/lib/hadoop-hdfs/socket.31001.  Closing shared memory segment.
Java exception follows:
java.io.IOException: ERROR_INVALID: there is no shared memory segment registered with shmId 1effcf56a590fbc371938a368987f4e9
        at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:214)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
 {code}
I can also find it in our Jenkins jobs, but there it only appears during the 
data-loading phase. So I suspect it only happens under heavy workloads.
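
To check whether the failures really cluster in heavy phases such as data loading, the log can be summarized per hour. The following is only a rough sketch: the function name `summarize_release_failures` is hypothetical, and the regular expression is derived solely from the single log entry quoted above, so it may need adjusting for other Hadoop versions or log layouts.

```python
import re
from collections import Counter

# Pattern inferred from the one log entry in this report (an assumption,
# not a documented format). It expects whitespace to be normalized first.
ENTRY_RE = re.compile(
    r"\bE(?P<mmdd>\d{4}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d+ \d+ "
    r"ShortCircuitCache\.java:\d+\] "
    r"ShortCircuitCache\(0x[0-9a-f]+\): failed to release short-circuit "
    r"shared memory slot Slot\(slotIdx=\d+, "
    r"shm=DfsClientShm\((?P<shm>[0-9a-f]+)\)\)"
)


def summarize_release_failures(log_text):
    """Return (failures per (MMDD, hour), set of affected shm IDs)."""
    # Collapse all whitespace so entries wrapped across lines still match.
    normalized = " ".join(log_text.split())
    per_hour = Counter()
    shm_ids = set()
    for match in ENTRY_RE.finditer(normalized):
        per_hour[(match.group("mmdd"), match.group("hour"))] += 1
        shm_ids.add(match.group("shm"))
    return per_hour, shm_ids


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < impalad.ERROR
    counts, shms = summarize_release_failures(sys.stdin.read())
    for (mmdd, hour), n in sorted(counts.items()):
        print(f"{mmdd} {hour}:00  {n} failure(s)")
    print(f"{len(shms)} distinct shared memory segment(s) affected")
```

Comparing the per-hour counts against the time window of the data-loading phase would show whether the errors are confined to it.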

HDFS-14701 mentions that this happens when the DataNode is stopped or restarted. 
However, I did not restart my HDFS cluster and I still see this error in the log.

It is worth investigating whether we are doing something wrong in the 
short-circuit related code.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

