+1!
On Tue, 2007-09-04 at 22:38 -0700, eric baldeschwieler (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524988 ]
>
> eric baldeschwieler commented on HADOOP-1700:
> ---------------------------------------------
>
> Just wanted to pitch in some context...
>
> Jim stated in the opening of this bug that a single client writing would be
> enough to address this issue. I agree. But what we should be clearer about
> is the ultimate desired semantics for readers. I'd define success as having
> a single client doing appends and flushes as desired (say per line in a log
> file) and having multiple clients "tail -f" the file and see updates at a
> reasonable rate, i.e. soon after each flush or every 64k bytes or so with less
> than a second's latency.
>
> This would let us build systems that log directly into HDFS and have related
> systems respond based on those log streams.
>
> This is where I'd like to see us get with this issue. Clearly getting there
> involves getting a handle on all the stuff already discussed in this thread.
> We also need to think carefully about the pipelining and protocol issues
> involved in making this work.
>
> We might want to break the protocol change issues into another discussion,
> but I want to make sure we don't converge on solutions that will not work
> considering fine grained "flushes".
>
> > Append to files in HDFS
> > -----------------------
> >
> > Key: HADOOP-1700
> > URL: https://issues.apache.org/jira/browse/HADOOP-1700
> > Project: Hadoop
> > Issue Type: New Feature
> > Components: dfs
> > Reporter: stack
> >
> > Request for being able to append to files in HDFS has been raised a couple
> > of times on the list of late. For one example, see
> > http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
> > Other mail describes folks' workarounds because this feature is lacking:
> > e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480
> > (Later on this thread, Jim Kellerman re-raises the HBase need of this
> > feature). HADOOP-337 'DFS files should be appendable' makes mention of
> > file append but it was opened early in the life of HDFS when the focus was
> > more on implementing the basics rather than adding new features. Interest
> > fizzled. Because HADOOP-337 is also a bit of a grab-bag -- it includes
> > truncation and being able to concurrently read/write -- rather than try and
> > breathe new life into HADOOP-337, instead, here is a new issue focused on
> > file append. Ultimately, being able to do as the Google GFS paper
> > describes -- having multiple concurrent clients making 'Atomic Record
> > Append' to a single file would be sweet but at least for a first cut at
> > this feature, IMO, a single client appending to a single HDFS file letting
> > the application manage the access would be sufficient.

-- 
Jim Kellerman, Senior Engineer; Powerset
[EMAIL PROTECTED]
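
P.S. For anyone skimming the thread, the reader semantics Eric describes (one client appending and flushing per record, several clients doing "tail -f") can be sketched roughly as below. This is only an illustration of the desired behaviour, assuming a local file as a stand-in for an HDFS file; it does not use any HDFS API, and SingleWriter, TailReader, and demo are hypothetical names made up for this sketch.

```python
import os
import tempfile


class SingleWriter:
    """Hypothetical single appender: writes one record, then flushes."""

    def __init__(self, path):
        # Append-only binary mode; a local file stands in for an HDFS file.
        self._f = open(path, "ab")

    def append(self, line):
        self._f.write(line.encode("utf-8") + b"\n")
        # Flush per record so readers can see it soon after it is written.
        self._f.flush()

    def close(self):
        self._f.close()


class TailReader:
    """Hypothetical 'tail -f' client: tracks its own offset into the file."""

    def __init__(self, path):
        self._path = path
        self._offset = 0

    def poll(self):
        """Return the complete lines appended since the previous poll."""
        with open(self._path, "rb") as f:
            f.seek(self._offset)
            data = f.read()
        # Consume only up to the last newline, so a partially written
        # record is left for the next poll instead of being returned.
        end = data.rfind(b"\n") + 1
        self._offset += end
        return data[:end].decode("utf-8").splitlines()


def demo():
    """One writer, two independent readers polling after each flush."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        writer = SingleWriter(path)
        readers = [TailReader(path), TailReader(path)]
        seen = []
        writer.append("event 1")
        seen.append([r.poll() for r in readers])
        writer.append("event 2")
        writer.append("event 3")
        seen.append([r.poll() for r in readers])
        writer.close()
        return seen
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Each reader keeps its own offset, so readers never coordinate with each other or with the writer; the open questions in the thread (pipelining, protocol changes, how quickly a flush becomes visible) are exactly the parts this local-file sketch glosses over.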