> More than one fd can be open on a given file, and many of open fd's are
> on files that have been deleted. The stale fd's are all on Data.db files in
> the data directory, which I have separate from the commit log directory.
>
> I haven't had a chance to look at the code handling files, and I am not any
> sort of Java expert, but could this be due to Java's lazy resource clean up?
> I wonder if when considering writing your own file handling classes for
> O_DIRECT or posix_fadvise or whatever, an explicit close(2) might help.

The fact that there are open fds to deleted files is interesting. I wonder
whether people have reported odd disk space usage in the past, since such
deleted files do not show up in 'du -sh' but keep consuming space on the
device until the last descriptor is closed.

My general understanding is that Cassandra does specifically rely on the GC
to know when unused sstables can be removed. However, the fact that the files
have already been deleted means, I think, that this is not the problem. The
question is rather why open file descriptors/streams to these deleted
sstables are leaking. But I'm saying this without knowing when or where the
streams are closed.

Are the deleted files indeed sstables, or was that a bad assumption on my
part?

-- 
/ Peter Schuller
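
One way to answer the last question is to list a process's descriptors that point at unlinked files. The following is only a rough sketch: it assumes a Linux /proc filesystem, where the kernel appends " (deleted)" to the link target of such a descriptor, and the function name deleted_open_files is made up for illustration. It is not anything Cassandra ships.

```python
import os

DELETED_SUFFIX = " (deleted)"  # marker Linux appends to unlinked targets in /proc/<pid>/fd


def deleted_open_files(pid):
    """Return sorted (fd, path) pairs for descriptors of `pid` whose file was unlinked.

    Hypothetical helper, Linux-only: relies on /proc/<pid>/fd symlinks.
    """
    fd_dir = "/proc/%d/fd" % pid
    results = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(DELETED_SUFFIX):
            results.append((int(name), target[:-len(DELETED_SUFFIX)]))
    return sorted(results)
```

Running it against the Cassandra process id (as the same user or as root) and checking whether the reported paths end in Data.db would show whether the stale descriptors really are on sstable data files.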