Re: What about append in hadoop files ?

Doug Cutting Fri, 14 Jul 2006 00:02:39 -0700

Thomas FRIOL wrote:

I would like to know today why it is not possible to append data into an existing file (Path) or why the FSDataOutputStream must be closed before the file is written to the DFS.

Those are the current semantics of the filesystem: a file is not readable until it is closed, and files are write-once. This considerably simplifies the implementation and supports the primary intended uses for DFS. The simpler we keep DFS, the easier it is to make it reliable and scalable. At this point we are prioritizing reliability and scalability over new features. Over time, when reliability and scalability are sufficiently demonstrated, these restrictions may be removed.

In fact, my problem is that I have a servlet which is regularly writing data into a file in the DFS. Today, if my JVM crashes, I lose all my data because my output stream is closed only when the JVM stops itself.


You could periodically close the file and start writing a new file.
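
A minimal sketch of that approach, under stated assumptions: it uses only java.nio against a local directory so that it runs without a cluster, and the class name RollingWriter, the file naming scheme, and the record-count threshold are all hypothetical. Against DFS, the stream would presumably come from FileSystem.create(Path), which returns an FSDataOutputStream, rather than from Files.newOutputStream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper: writes records to a sequence of files and closes
 * each file after a fixed number of records, so that a crash loses at
 * most the records in the file that is currently open.
 */
public class RollingWriter implements AutoCloseable {
    private final Path dir;
    private final String prefix;
    private final int recordsPerFile;
    private OutputStream out;      // null when no file is open
    private int recordsInCurrent;  // records written to the open file
    private int fileIndex;         // sequence number of the next file

    public RollingWriter(Path dir, String prefix, int recordsPerFile) throws IOException {
        if (recordsPerFile < 1) {
            throw new IllegalArgumentException("recordsPerFile must be >= 1");
        }
        this.dir = Files.createDirectories(dir);
        this.prefix = prefix;
        this.recordsPerFile = recordsPerFile;
    }

    /** Appends one record, opening a new file first if none is open. */
    public void write(String record) throws IOException {
        if (out == null) {
            // Assumed naming scheme: <prefix>-00000.log, <prefix>-00001.log, ...
            Path next = dir.resolve(String.format("%s-%05d.log", prefix, fileIndex++));
            out = Files.newOutputStream(next);
            recordsInCurrent = 0;
        }
        out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
        if (++recordsInCurrent >= recordsPerFile) {
            roll();
        }
    }

    /** Closes the current file; the next write starts a new one. */
    public void roll() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }

    @Override
    public void close() throws IOException {
        roll();
    }
}
```

With recordsPerFile set to 2, writing "a", "b", "c" leaves a closed file holding the first two records and an open file holding the third. Rolling on a timer instead of a record count would work the same way by calling roll() periodically. The sequence number restarts at zero with each new instance, so a version that must survive restarts would need names that do not collide, for example ones based on a timestamp.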

DFS is currently primarily used to support large, offline, batch computations. For example, a log of critical data with tight transactional requirements is probably an inappropriate use of DFS at this time. Again, this may change, but that's where we are now.


Doug
