[ 
https://issues.apache.org/jira/browse/DAFFODIL-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Beckerle updated DAFFODIL-2637:
------------------------------------
    Priority: Minor  (was: Major)

> Daffodil opening too many schema files
> --------------------------------------
>
>                 Key: DAFFODIL-2637
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2637
>             Project: Daffodil
>          Issue Type: Bug
>            Reporter: Steve Lawrence
>            Priority: Minor
>
> When running sbt test in a schema project (specifically P8) with about 100 
> schema files, each of which imports some common schema files (e.g. 
> common.dfdl.xsd, strings.dfdl.xsd) for shared use, Daffodil is holding open 
> hundreds of file descriptors to these same common files. In some cases, this 
> can hit a resource limit and cause failures with "Too many open files" errors.
> A good way to visualize this is to start an sbt console:
> {code}
> sbt -mem 4096
> {code}
> Then in another terminal, find the process ID of the running sbt process:
> {code}
> pgrep -f sbt
> {code}
> Start this watch command to see all the files that are open, including 
> the count of duplicates, replacing $PID with the pid you just found:
> {code}
> watch -n 0 'readlink /proc/$PID/fd/* | sort | uniq -c | sort -n -r'
> {code}
> Then trigger the tests to run back in the sbt console:
> {code}
> sbt> test
> {code}
> As the tests are run, the top of the watch screen will show many open file 
> descriptors to those common files, e.g.
> {code}
>     187 /path/to/common.dfdl.xsd
>      45 /path/to/strings.dfdl.xsd
>      ...
>       4 anon_inode:inotify
>       3 /dev/urandom
>       3 /dev/random
>       3 /dev/pts/0
>       1 socket:[2233641]
>       1 socket:[2233640]
> {code}
> In some cases, I've seen these common files get to over 1000 instances.
> It's not clear where this is happening. It could be schema compilation, it 
> could be somewhere in the TDML runner, maybe somewhere with validating 
> infosets, etc. We need to track down what is opening all these duplicate 
> files and see if it can be fixed so they can be shared. Or if that's 
> difficult, maybe we can at least close these files earlier so that we don't 
> have so many open file descriptors at once.
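
For a one-off snapshot rather than a live watch, the readlink pipeline from the description can be wrapped in a small shell function. This is only a sketch: the name count_fd_targets is made up for illustration and is not part of Daffodil or sbt, and the /proc usage assumes Linux.

```shell
# Hypothetical helper (not part of Daffodil or sbt): given a directory of
# file-descriptor symlinks such as /proc/<pid>/fd, print how many descriptors
# point at each target, most duplicated first.
count_fd_targets() {
    fd_dir="$1"
    for fd in "$fd_dir"/*; do
        # readlink prints the symlink target; descriptors that are closed
        # between the glob and this call are silently skipped.
        readlink "$fd" 2>/dev/null
    done | sort | uniq -c | sort -n -r
}
```

Called as count_fd_targets "/proc/$PID/fd", with $PID replaced by the sbt process ID found with pgrep, it prints the same count-per-target listing that the watch screen shows, which makes it easy to save the output before and after a test run and compare.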



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
