For 1: yes, the data is written to HDFS in chunks if you are using the FileSystem API. The whole file is not stored in memory first.
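To make the chunked behaviour concrete, here is a minimal sketch of a streaming copy. It uses plain java.io streams as stand-ins so that it runs on its own: in your servlet the input would presumably be `request.getInputStream()`, and the output would be the stream returned by Hadoop's `FileSystem.create(Path)`. The class name `ChunkedCopy` and the 8 KB buffer size are my own assumptions, not anything HDFS prescribes. It also uses byte streams rather than a BufferedWriter, since a writer is meant for character data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ChunkedCopy {
    // Hypothetical buffer size: this copy loop holds at most this many bytes at once.
    static final int BUFFER_SIZE = 8192;

    /** Copies in to out one chunk at a time and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // forward only the bytes actually read
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the servlet input stream and the HDFS output stream.
        InputStream in = new ByteArrayInputStream(new byte[20000]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out) + " bytes copied");
    }
}
```

Because `copy` only depends on `InputStream` and `OutputStream`, the same loop should work unchanged once the stand-ins are swapped for the real servlet and HDFS streams.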
For 2: I think you shouldn't rely on an exception, or on not closing the writer, to clean up the partially written file. That is neither safe nor recommended practice. Do the cleanup explicitly: free the resources and delete whatever you want removed. That gives you better visibility, control and robustness.

Regards,
Shahab

On Mon, Aug 19, 2013 at 3:38 PM, Adeel Qureshi <[email protected]> wrote:

> I have a servlet that receives files in a streaming fashion, and our
> original design was to receive the file in the /tmp directory and then move it
> to hdfs via an external process, but that seems to add an additional (maybe
> unnecessary) step. My question is, if I receive files in a servlet as a post
> request (file is in body of request) and I open a bufferedwriter on hdfs,
> then
>
> 1. are the files really written in a streaming fashion such that nothing
> is held in memory? These are huge files, and keeping them in memory
> and then sending the whole file to hdfs at the end won't make sense.
>
> 2. if for some reason we decide halfway through the file to reject it and
> not move it to hdfs, since it was being streamed, do we have to remove the
> file ourselves, or will it be cleaned up automatically by the file system
> simply because the write stream isn't closed or some exception is thrown?
>
> Thanks
> Adeel
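As an illustration of the explicit cleanup recommended in point 2 of the reply, here is a minimal sketch that streams an upload to a target file and deletes the partial file itself if the upload is rejected or fails. It uses java.nio.file on the local file system as a stand-in so that it runs without a cluster. With HDFS, the corresponding calls would presumably be `FileSystem.create(Path)` to open the stream and `FileSystem.delete(Path, boolean)` to remove the file. The class name `RejectableUpload` and the size limit used as the rejection rule are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class RejectableUpload {
    /**
     * Streams in to target and returns true if the file was kept.
     * maxBytes is a hypothetical rejection rule standing in for whatever
     * check makes you decide to reject the file part-way through.
     */
    static boolean store(InputStream in, Path target, long maxBytes) throws IOException {
        boolean accepted = false;
        try (OutputStream out = Files.newOutputStream(target)) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
                if (total > maxBytes) {
                    return false; // rejected: the finally block removes the partial file
                }
                out.write(buffer, 0, read);
            }
            accepted = true;
            return true;
        } finally {
            // try-with-resources has already closed the stream at this point.
            // Delete explicitly on rejection or on any exception.
            if (!accepted) {
                Files.deleteIfExists(target);
            }
        }
    }
}
```

The point of the design is that the delete is a visible, deliberate step in your own code, so it happens whether the upload is rejected by your check or interrupted by an exception.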
