Hi All,

I'm experiencing some memory retention while copying data into HDFS when an IOException is thrown.
My use case is the following: I have multiple threads sharing a single FileSystem object, all uploading files. At some point the quota is exceeded in one thread and I get a DSQuotaExceededException (a subclass of IOException). In both the regular case and when such an exception is thrown, I close the DFSOutputStream. But for a DFSOutputStream that encountered an IOException, the last Packet is kept in memory until the FileSystem is closed, which I usually don't do very often.

So my questions:
- Is this the expected behavior, and do I need to deal with it myself?
- Is there a way to properly close a DFSOutputStream (freeing all the retained memory) without closing the FileSystem?
- Is sharing one FileSystem object across several threads recommended?

Attached is a simple test reproducing the behavior: a MiniDFSCluster is launched and a very small quota is set so that an IOException is thrown. Random content is generated and uploaded to HDFS. The FileSystem is never closed, so memory grows until an OOM is thrown (don't blame me for the @Test(expected = OutOfMemoryError.class) :)). Tested on Hadoop 1.0.2. A minimal sketch of the pattern is pasted below my signature.

Thanks in advance for your answers, pointers and advice.

Benoit.
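For readers without the attachment, the upload loop boils down to roughly the sketch below. This is not the actual test, just an illustration of the pattern: the directory path, file names, and buffer size are made up, and the real code shares the FileSystem across threads and runs against a MiniDFSCluster.

    import java.io.IOException;
    import java.util.Random;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class QuotaLeakSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // One FileSystem instance, shared by all uploading threads in the real code.
        FileSystem fs = FileSystem.get(conf);
        Random random = new Random();
        byte[] buffer = new byte[64 * 1024];

        // Upload files in a loop; once the directory quota is exceeded,
        // every write fails with a DSQuotaExceededException.
        for (int i = 0; ; i++) {
          FSDataOutputStream out = null;
          try {
            out = fs.create(new Path("/quota-dir/file-" + i)); // hypothetical path
            random.nextBytes(buffer);
            out.write(buffer);
          } catch (IOException e) {
            // DSQuotaExceededException lands here once the quota is hit.
            System.err.println("upload failed: " + e.getMessage());
          } finally {
            // Closing the stream does not seem to free the last Packet of a
            // stream that hit an IOException; it stays referenced until
            // fs.close() is called.
            IOUtils.closeStream(out);
          }
          // fs.close() is intentionally never called, so the retained
          // packets accumulate and memory grows until an OOM.
        }
      }
    }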