Re: Overwriting the same block instead of creating a new one

Todd Lipcon Tue, 22 Jun 2010 00:02:28 -0700

On Mon, Jun 21, 2010 at 10:42 PM, Vidur Goyal <[email protected]> wrote:


> As in any other filesystem, such as ext4, in case of an overwrite why
> don't we update the existing physical memory? Why is there a need to
> allocate memory every time an overwrite takes place? Isn't this an overhead?
>
>
I think the issue is that you are trying to apply principles from ext4 to
HDFS, which has very different goals. HDFS focuses on sequential IO, and
overwrites are very rare. So unless there's a huge gain to be had, the
simplest implementation will win every time.

-Todd


>
> > I know about the current behaviour of HDFS. I am proposing the new
> > behaviour which I mentioned in my first mail.
> >
> > In Hadoop-0.20.2, a new block is allocated and stored at the datanodes and
> > a new INode is created in the namespace. Why is an overwrite considered a
> > file creation operation?
> >
> > -vidur
> >> Hi Vidur,
> >>
> >> I'm not following. The "overwrite" flag causes the file to be overwritten
> >> starting at offset 0 - it doesn't allow you to retain any bit of the
> >> preexisting file. It's equivalent to a remove followed by a create. Think
> >> of it like O_TRUNC.
> >>
> >> -Todd
> >>
> >> On Mon, Jun 21, 2010 at 10:03 PM, Vidur Goyal
> >> <[email protected]> wrote:
> >>
> >>> Dear Todd,
> >>>
> >>> By truncating I meant removing unused *blocks* from the namespace and
> >>> letting them be garbage collected. There will be no truncation of the
> >>> last block (even if it is not full). This way, rather than garbage
> >>> collecting all the blocks of a file, we will only be garbage collecting
> >>> the remaining blocks.
> >>>
> >>> -vidur
> >>>
> >>>
> >>> > HDFS assumes in hundreds of places that blocks never shrink. So, there
> >>> > is no option to truncate a block.
> >>> >
> >>> > -Todd
> >>> >
> >>> > On Mon, Jun 21, 2010 at 9:41 PM, Vidur Goyal
> >>> > <[email protected]> wrote:
> >>> >
> >>> >> Hi All,
> >>> >>
> >>> >> In FSNamesystem#startFileInternal, whenever the overwrite flag is set,
> >>> >> why is the INode removed from the namespace and a new
> >>> >> INodeFileUnderConstruction created? Why can't we convert the same INode
> >>> >> to an INodeFileUnderConstruction and start writing to the same blocks
> >>> >> at the same datanodes (after incrementing the GS), followed by either
> >>> >> truncating the remaining blocks (if the file size decreases) or
> >>> >> allocating new blocks (if the file size increases)? This will decrease
> >>> >> data redundancy and the job of the garbage collector, and will increase
> >>> >> security.
> >>> >>
> >>> >> vidur
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> This message has been scanned for viruses and
> >>> >> dangerous content by MailScanner, and is
> >>> >> believed to be clean.
> >>> >>
> >>> >>
> >>> >
> >>> >
> >>> > --
> >>> > Todd Lipcon
> >>> > Software Engineer, Cloudera
> >>> >
> >>> >
> >>> >
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >>
> >>
> >>
> >
> >
> >
>
>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
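
For readers following the O_TRUNC comparison quoted in the thread above, here
is a rough local-filesystem sketch of why "overwrite" and "remove followed by a
create" come to the same thing from the caller's side. It uses java.nio on a
local temp directory, not the HDFS API, and the class and method names
(OverwriteSemantics, truncateAndWrite, removeThenCreate) are made up for this
illustration. It says nothing about how the NameNode implements the operation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: compares O_TRUNC-style overwrite with
// delete-then-create on a local filesystem.
final class OverwriteSemantics {

    // O_TRUNC style: open the existing file, discard its contents,
    // and write the new data starting at offset 0.
    static String truncateAndWrite(Path file, String data) throws IOException {
        Files.write(file, data.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // "Remove followed by a create": delete the old file, then create a
    // brand-new one. CREATE_NEW fails if the file still exists.
    static String removeThenCreate(Path file, String data) throws IOException {
        Files.deleteIfExists(file);
        Files.write(file, data.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW,
                StandardOpenOption.WRITE);
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("overwrite-demo");
        Path a = dir.resolve("a.txt");
        Path b = dir.resolve("b.txt");

        // Both files start with the same, longer, old contents.
        byte[] old = "old contents, longer than the new data"
                .getBytes(StandardCharsets.UTF_8);
        Files.write(a, old);
        Files.write(b, old);

        String viaTruncate = truncateAndWrite(a, "new");
        String viaRecreate = removeThenCreate(b, "new");

        // Neither path retains any byte of the preexisting file.
        System.out.println(viaTruncate.equals(viaRecreate));
        System.out.println(viaTruncate);
    }
}
```

In both cases nothing of the old file survives, which is the point of the
analogy. The difference discussed in this thread is internal: per the quoted
messages, Hadoop-0.20.2 takes the second route and allocates a new INode and
new blocks rather than reusing the old ones.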
