Hi Thanh,

There were some known bugs in the append feature in Apache Hadoop 0.19. These bugs were fixed in both the Apache Hadoop 0.20-append branch and Apache Hadoop 0.21. I do not know whether Cloudera's distribution includes these fixes, so you may want to ask on Cloudera's mailing lists.

I am sorry for the bugs.

Regards,
Nicholas

________________________________
From: Thanh Do <than...@cs.wisc.edu>
To: Ted Dunning <tdunn...@maprtech.com>
Cc: hdfs-u...@hadoop.apache.org; hdfs-dev@hadoop.apache.org
Sent: Fri, April 15, 2011 8:02:05 AM
Subject: Re: silent data loss during append

I am using Cloudera's distribution, version hadoop-0.20.2+738.

On Thu, Apr 14, 2011 at 6:23 PM, Ted Dunning <tdunn...@maprtech.com> wrote:

> What version are you using?
>
> On Thu, Apr 14, 2011 at 3:55 PM, Thanh Do <than...@cs.wisc.edu> wrote:
>
>> Hi all,
>>
>> I have recently seen silent data loss in our system.
>> Here is the case:
>>
>> 1. The client appends to some block.
>> 2. For some reason, commitBlockSynchronization
>>    returns successfully with synclist = [] (i.e., empty).
>> 3. In the client code, NO exception is thrown, and
>>    the client's append appears to succeed.
>> 4. However, the block replicas are then removed from
>>    the datanodes, causing data loss.
>>
>> Has anyone seen this before?
>> Is this behavior by design or a bug?
>>
>> Many thanks,
>> Thanh