[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968204#comment-14968204
 ] 

Ian Maxon commented on ASTERIXDB-1146:
--------------------------------------

This may be a silly question, but did you try adjusting 
"storage.buffercache.maxopenfiles" in asterix-configuration.xml? It defaults 
to an absurdly high value, and the buffer cache closes file handles lazily 
rather than eagerly. Unless it is set low enough that the NC can never exceed 
the real limit imposed by ulimit, this can easily happen.
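For reference, lowering the parameter would look something like this in 
asterix-configuration.xml; a sketch only, assuming the property-list format 
that file uses, and the value 4096 is a hypothetical example that should be 
chosen safely below the NC's `ulimit -n`:

```xml
<!-- asterix-configuration.xml (sketch; 4096 is a hypothetical value,
     pick something below the OS per-process file-descriptor limit) -->
<property>
  <name>storage.buffercache.maxopenfiles</name>
  <value>4096</value>
</property>
```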

> Cleaning up left-overs once the query is done  - External datasets
> ------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1146
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1146
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: AsterixDB
>            Reporter: Pouria
>            Assignee: Abdullah Alamoudi
>            Priority: Minor
>
> Running queries that use external datasets for a long time, without 
> restarting the AsterixDB instance in between, causes the number of open 
> files to grow, and it can eventually break the system.
> Using the 'lsof' command, it appears there are left-over TCP connections 
> (ESTABLISHED and CLOSE_WAIT) between the NC and the DataNode:
> java       7576    pouria  292u     IPv4          393409691       0t0        
> TCP asterix-10.ics.uci.edu:46965->asterix-10.ics.uci.edu:50010 (CLOSE_WAIT)
> java       7576    pouria  293u     IPv4          393412907       0t0        
> TCP asterix-10.ics.uci.edu:47005->asterix-10.ics.uci.edu:50010 (ESTABLISHED)
> …
> java      32205    pouria  576u     IPv4          393415126       0t0        
> TCP asterix-10.ics.uci.edu:50010->asterix-10.ics.uci.edu:47056 (ESTABLISHED)
> java      32205    pouria  586u     IPv4          393414645       0t0        
> TCP asterix-10.ics.uci.edu:50010->asterix-10.ics.uci.edu:47045 (ESTABLISHED)
> …
> Here is the error upon system breakage from CC logs:
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> org.apache.hyracks.api.exceptions.HyracksDataException: 
> org.apache.hyracks.api.exceptions.HyracksDataException: java.io.IOException: 
> Too many open files
>         at 
> org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:45)
>         at org.apache.hyracks.control.nc.Task.run(Task.java:312)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: 
> org.apache.hyracks.api.exceptions.HyracksDataException: java.io.IOException: 
> Too many open files
>         at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:358)
>         at org.apache.hyracks.control.nc.Task.run(Task.java:290)
>         ... 3 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: 
> java.io.IOException: Too many open files
>         at 
> org.apache.hyracks.control.nc.io.IOManager.createWorkspaceFile(IOManager.java:171)
>         at 
> org.apache.hyracks.control.nc.io.WorkspaceFileFactory.createManagedWorkspaceFile(WorkspaceFileFactory.java:39)
>         at 
> org.apache.hyracks.control.nc.Joblet.createManagedWorkspaceFile(Joblet.java:262)
>         at 
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoin.buildWrite(OptimizedHybridHashJoin.java:332)
>         at 
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoin.spillPartition(OptimizedHybridHashJoin.java:311)
>         at 
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoin.processTuple(OptimizedHybridHashJoin.java:237)
>         at 
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoin.build(OptimizedHybridHashJoin.java:215)
>         at 
> org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$PartitionAndBuildActivityNode$1.nextFrame(OptimizedHybridHashJoinOperatorDescriptor.java:313)
>         at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:342)
>         ... 4 more
> Caused by: java.io.IOException: Too many open files
>         at java.io.UnixFileSystem.createFileExclusively(Native Method)
>         at java.io.File.createNewFile(File.java:1006)
>         at java.io.File.createTempFile(File.java:1989)
>         at 
> org.apache.hyracks.control.nc.io.IOManager.createWorkspaceFile(IOManager.java:169)
>         ... 12 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
