Eric,

I remember Doug advised somebody on a related issue to use a directory instead of a file for long-lasting appends. You can logically divide your output into smaller files and close each one whenever a logical boundary is reached. The directory can then be treated as a collection of records. Maybe this will work for you.
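
For what it's worth, here is a rough sketch of that pattern against the FileSystem API. The class name, the part-file naming, and the record-count boundary are just illustrative assumptions on my part, not anything Doug prescribed:

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.IOException;

// Sketch: a directory stands in for one logical append-only "file".
// Records go into numbered part files; a part file is closed (and thus
// becomes safely readable) whenever a logical boundary is reached.
public class RollingRecordWriter {
  private final FileSystem fs;
  private final Path recordsDir;        // directory treated as a collection of records
  private final long rollAfterRecords;  // logical boundary (record count is just an example)
  private FSDataOutputStream out;
  private long written = 0;
  private int part = 0;

  public RollingRecordWriter(FileSystem fs, Path recordsDir, long rollAfterRecords)
      throws IOException {
    this.fs = fs;
    this.recordsDir = recordsDir;
    this.rollAfterRecords = rollAfterRecords;
    fs.mkdirs(recordsDir);
    roll();
  }

  // Close the current part file and open the next one.
  private void roll() throws IOException {
    if (out != null) {
      out.close();                      // closed parts are complete and readable
    }
    out = fs.create(new Path(recordsDir, String.format("part-%05d", part++)));
    written = 0;
  }

  // "Append" one record; start a new part file when the boundary is reached.
  public void write(byte[] record) throws IOException {
    out.write(record);
    if (++written >= rollAfterRecords) {
      roll();
    }
  }

  public void close() throws IOException {
    out.close();
  }
}

A reader can then list the directory and process the closed part files as one logical stream of records.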
IMO, the concurrent append feature is a high-priority task.

--Konstantin

Doug Cutting wrote:

drwho wrote:

If so, is GFS also suitable only for large, offline, batch computations?
I wonder how Google is going to use GFS for Writely, their online
spreadsheet, or their BigTable (their gigantic relational DB).


Did I say anything about GFS? I don't think so. Also, I said "currently" and "primarily", not "forever" and "exclusively". I would love for DFS to be more suitable for online, incremental stuff, but we're a ways from that right now. As I said, we're pursuing reliability, scalability, and performance before features like append. If you'd like to try to implement append without disrupting work on reliability, scalability, and performance, we'd welcome your contributions. The project's direction is determined by its contributors.

Note that BigTable is a complex layer on top of GFS that caches and batches I/O. So, while GFS does implement some features that DFS still does not (like appends), GFS is probably not used directly by, e.g., Writely. Finally, BigTable is not relational.

Doug

Doug Cutting <[EMAIL PROTECTED]> wrote: <chopped>

DFS is currently primarily used to support large, offline, batch computations. For example, a log of critical data with tight transactional requirements is probably an inappropriate use of DFS at this time. Again, this may change, but that's where we are now.

Doug




Thanks much.

-eric



