Re: [jira] Commented: (HADOOP-1700) Append to files in HDFS

Jim Kellerman Tue, 04 Sep 2007 22:47:02 -0700

+1!


On Tue, 2007-09-04 at 22:38 -0700, eric baldeschwieler (JIRA) wrote:
>     [ 
> https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524988
>  ] 
> 
> eric baldeschwieler commented on HADOOP-1700:
> ---------------------------------------------
> 
> Just wanted to pitch in some context...
> 
> Jim stated in the opening of this bug that a single client writing would be 
> enough to address this issue.  I agree.  But what we should be clearer about 
> is the ultimate desired semantics for readers.  I'd define success as having 
> a single client doing appends and flushes as desired (say per line in a log 
> file) and having multiple clients "tail -f" the file and see updates at a 
> reasonable rate, i.e. soon after each flush or every 64 KB or so, with less 
> than a second's latency.
> 
> This would let us build systems that log directly into HDFS and have related 
> systems respond based on those log streams.
> 
> This is where I'd like to see us get with this issue.  Clearly getting there 
> involves getting a handle on all the stuff already discussed in this thread.  
> We also need to think carefully about the pipelining and protocol issues 
> involved in making this work.
> 
> We might want to break the protocol change issues into another discussion, 
> but I want to make sure we don't converge on solutions that will not work 
> considering fine grained "flushes".
> 
> > Append to files in HDFS
> > -----------------------
> >
> >                 Key: HADOOP-1700
> >                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
> >             Project: Hadoop
> >          Issue Type: New Feature
> >          Components: dfs
> >            Reporter: stack
> >
> > Requests to be able to append to files in HDFS have been raised a couple 
> > of times on the list of late.  For one example, see 
> > http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
> >   Other mail describes folks' workarounds because this feature is lacking: 
> > e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 
> > (Later on this thread, Jim Kellerman re-raises the HBase need of this 
> > feature).  HADOOP-337 'DFS files should be appendable' makes mention of 
> > file append but it was opened early in the life of HDFS when the focus was 
> > more on implementing the basics rather than adding new features.  Interest 
> > fizzled.  Because HADOOP-337 is also a bit of a grab-bag -- it includes 
> > truncation and being able to concurrently read/write -- rather than try and 
> > breathe new life into HADOOP-337, instead, here is a new issue focused on 
> > file append.  Ultimately, being able to do as the Google GFS paper 
> > describes -- having multiple concurrent clients making 'Atomic Record 
> > Append' to a single file -- would be sweet, but at least for a first cut 
> > at this feature, IMO, a single client appending to a single HDFS file, 
> > letting the application manage the access, would be sufficient.
> 
-- 
Jim Kellerman, Senior Engineer; Powerset
[EMAIL PROTECTED]
