I totally agree, Harsh. It was just to avoid any misinterpretation :). I have seen quite a few discussions as well that talk about these issues.
I would strongly recommend switching away from 1.x if append is desired. (A rough sketch of the 2.x append call is included below the quoted thread, for reference.)

Warm Regards,
Tariq
cloudfront.blogspot.com

On Sat, Jul 6, 2013 at 7:29 AM, Harsh J <[email protected]> wrote:
> The append in 1.x is very broken. You'll run into very weird states and we
> officially do not support it (we even call it out in the config as broken).
> I wouldn't recommend using it even if a simple test appears to work.
>
> On Sat, Jul 6, 2013 at 6:27 AM, Mohammad Tariq <[email protected]> wrote:
> > @Robin East : Thank you for keeping me updated. I was on 1.0.3 when I had
> > tried append last time and it was not working despite the fact that the API
> > had it. I tried it with 1.1.2 and it seems to work fine.
> >
> > @Manickam : Apologies for the incorrect info. The latest stable release (1.1.2)
> > supports append. But you should consider what Harsh has said.
> >
> > Warm Regards,
> > Tariq
> > cloudfront.blogspot.com
> >
> > On Fri, Jul 5, 2013 at 4:24 PM, Harsh J <[email protected]> wrote:
> >> If it is 1k new records at the "end of the file" then you may extract
> >> them out and append to the existing file in HDFS. I'd recommend using
> >> HDFS from Apache Hadoop 2.x for this purpose.
> >>
> >> On Fri, Jul 5, 2013 at 4:22 PM, Manickam P <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > Let me explain the question clearly. I have a file which has one million
> >> > records and I moved it into my Hadoop cluster.
> >> > After one month I got a new file which has the same one million records
> >> > plus 1000 new records added at the end of the file.
> >> > Here I just want to move the 1000 records alone into HDFS instead of
> >> > overwriting the entire file.
> >> >
> >> > Can I use HBase for this scenario? I don't have a clear idea about HBase.
> >> > Just asking.
> >> >
> >> > Thanks,
> >> > Manickam P
> >> >
> >> >> From: [email protected]
> >> >> Date: Fri, 5 Jul 2013 16:13:16 +0530
> >> >> Subject: Re: How to update a file which is in HDFS
> >> >> To: [email protected]
> >> >>
> >> >> The answer to the "delta" part is more that HDFS does not presently
> >> >> support random writes. You cannot alter a closed file for anything
> >> >> other than appending at the end, which I doubt will help you if you
> >> >> are also receiving updates (it isn't clear from your question what
> >> >> this added data really is).
> >> >>
> >> >> HBase sounds like something that may solve your requirement though,
> >> >> depending on how much of your read/write load is random. You could
> >> >> consider it.
> >> >>
> >> >> P.S. HBase too doesn't use the append() APIs today (and doesn't need
> >> >> them either). AFAIK, only Flume makes use of it, if you allow it to.
> >> >>
> >> >> On Thu, Jul 4, 2013 at 5:17 PM, Mohammad Tariq <[email protected]> wrote:
> >> >> > Hello Manickam,
> >> >> >
> >> >> > Append is currently not possible.
> >> >> >
> >> >> > Warm Regards,
> >> >> > Tariq
> >> >> > cloudfront.blogspot.com
> >> >> >
> >> >> > On Thu, Jul 4, 2013 at 4:40 PM, Manickam P <[email protected]> wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> I have moved my input file into the HDFS location in the cluster setup.
> >> >> >> Now I got a new file which has some new records along with the old ones.
> >> >> >> I want to move the delta part alone into HDFS because it will take more
> >> >> >> time to move the whole file from my local machine to the HDFS location.
> >> >> >> Is it possible or do I need to move the entire file into HDFS again?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Manickam P
> >> >> >
> >> >>
> >> >> --
> >> >> Harsh J
> >>
> >> --
> >> Harsh J
>
> --
> Harsh J
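For anyone landing on this thread later: below is a minimal, untested sketch of the append path Harsh describes, using the Apache Hadoop 2.x FileSystem API. The HDFS path, the local delta file, and the assumption that the 1000 new records have already been extracted from the tail of the new file are hypothetical and only there for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

import java.io.FileInputStream;
import java.io.InputStream;

public class AppendDelta {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS and the rest of the cluster config from
        // core-site.xml / hdfs-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical paths: the file already sitting in HDFS, and the
        // 1000 new records extracted locally from the tail of the new file.
        Path target = new Path("/data/records.txt");
        String delta = "/tmp/new-1000-records.txt";

        try (InputStream in = new FileInputStream(delta);
             FSDataOutputStream out = fs.append(target)) {
            // Copy the local delta onto the end of the existing HDFS file;
            // 'false' leaves stream closing to the try-with-resources block.
            IOUtils.copyBytes(in, out, conf, false);
        }
    }
}

If I remember correctly, newer 2.x releases also ship a shell equivalent, "hadoop fs -appendToFile <localsrc> <dst>", which does the same thing without writing any code.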
