Eric,
I remember Doug advised somebody on a related issue to use a directory
instead of a file for long-lasting appends.
You can logically divide your output into smaller files and close them
whenever a logical boundary is reached.
The directory can then be treated as a collection of records. Maybe this
will work for you.
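Roughly, the pattern could look like the sketch below. This is only an
illustration of the idea, not an existing Hadoop API: the class name
RollingRecordWriter, the per-file record threshold, and the "part-NNNNN"
file naming are all made up for the example; only the FileSystem /
FSDataOutputStream calls are real.

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // Treats a directory as one logical, append-only "file": records are
  // written to a series of small files, each closed at a logical boundary.
  public class RollingRecordWriter {
    private final FileSystem fs;
    private final Path dir;              // directory holding the record files
    private final long recordsPerFile;   // logical boundary (could also be bytes or time)
    private FSDataOutputStream out;
    private long recordsInCurrentFile;
    private int fileIndex;

    public RollingRecordWriter(Configuration conf, Path dir, long recordsPerFile)
        throws IOException {
      this.fs = FileSystem.get(conf);
      this.dir = dir;
      this.recordsPerFile = recordsPerFile;
      fs.mkdirs(dir);
    }

    // Append one record; roll over to a new file when the boundary is reached.
    public synchronized void write(byte[] record) throws IOException {
      if (out == null) {
        out = fs.create(new Path(dir, String.format("part-%05d", fileIndex++)));
      }
      out.write(record);
      out.write('\n');
      if (++recordsInCurrentFile >= recordsPerFile) {
        out.close();                     // once closed, the file is complete in DFS
        out = null;
        recordsInCurrentFile = 0;
      }
    }

    public synchronized void close() throws IOException {
      if (out != null) {
        out.close();
      }
    }
  }

Since an input path can point at a directory, a later MapReduce job can
read the whole directory as if it were a single input.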
IMO the concurrent append feature is a high-priority task.
--Konstantin
Doug Cutting wrote:
drwho wrote:
If so, is GFS also suitable only for large, offline, batch
computations?
I wonder how Google is going to use GFS for writely, their online
spreadsheet, or their BigTable (their gigantic relational DB).
Did I say anything about GFS? I don't think so. Also, I said,
"currently" and "primarily", not "forever" and "exclusively". I would
love for DFS to be more suitable for online, incremental stuff, but
we're a ways from that right now. As I said, we're pursuing
reliability, scalability and performance before features like append.
If you'd like to try to implement append w/o disrupting work on
reliability, scalability, and performance, we'd welcome your
contributions. The project direction is determined by contributors.
Note that BigTable is a complex layer on top of GFS that caches and
batches I/O. So, while GFS does implement some features that DFS
still does not (like appends), GFS is probably not used directly by,
e.g., writely. Finally, BigTable is not relational.
Doug
Doug Cutting <[EMAIL PROTECTED]> wrote: <chopped>
DFS is currently primarily used to support large, offline, batch
computations. For example, a log of critical data with tight
transactional requirements is probably an inappropriate use of DFS at
this time. Again, this may change, but that's where we are now.
Doug
Thanks much.
-eric