On Sun, Jun 10, 2012 at 9:39 AM, Harsh J <ha...@cloudera.com> wrote:

> Mohit,
>
> On Sat, Jun 9, 2012 at 11:11 PM, Mohit Anchlia <mohitanch...@gmail.com>
> wrote:
> > Thanks, Harsh, for the detailed info. It clears things up. The only
> > thing on those pages that concerns me is what happens when the client
> > crashes. It says you could lose up to a block's worth of information.
> > Is this still true given that the NN would auto-close the file?
>
> Where does it say this exactly? It is true that immediate readers will
> not get the last block (as it remains open and uncommitted), but once
> the lease recovery kicks in the file is closed successfully and the
> last block is indeed made available, so there's no 'data loss'.
>
I saw it in the "Coherency Model" -> "consequences of application design"
paragraph. Thanks for the information. At least I don't have to worry
about data loss when the file is not closed.

>
> > Is it a good practice to reduce the NN default value so that it
> > auto-closes before 1 hr?
>
> I've not seen people do this/need to do this. Most don't run into such
> a situation, and it is vital to properly close() files or sync() on
> file streams before making them available to readers. HBase manages open
> files during WAL-recovery using lightweight recoverLease APIs that
> were added for its benefit, so it doesn't need to wait for an hour for
> WALs to close and recover data.
>
> --
> Harsh J
>
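
To check my understanding of the behaviour described above, here is a small toy model in Java. It is only a sketch and not HDFS code: the class LeaseRecoverySketch, its methods, and the 60-minute constant are hypothetical names made up for illustration (the one-hour figure is the default mentioned in this thread), and real lease recovery is asynchronous rather than instant. It shows that the open last block is hidden from readers until the file is closed, and that the close happens either when the hard limit passes or when another client asks for recovery explicitly.

```java
/** Toy model of a file left open by a crashed writer; NOT the HDFS implementation. */
final class LeaseRecoverySketch {
    // Assumed from the thread: the NameNode closes the file on its own after about an hour.
    static final long HARD_LIMIT_MINUTES = 60;

    private final long committedBlocks; // blocks readers can already see
    private long idleMinutes = 0;       // time since the writer last renewed its lease
    private boolean closed = false;     // false while the last block is open and uncommitted

    LeaseRecoverySketch(long committedBlocks) {
        this.committedBlocks = committedBlocks;
    }

    /** Blocks a new reader can see; the open last block only counts once the file is closed. */
    long visibleBlocks() {
        return closed ? committedBlocks + 1 : committedBlocks;
    }

    /** Time passes with no lease renewal; reaching the hard limit closes the file. */
    void elapseMinutes(long minutes) {
        idleMinutes += minutes;
        if (idleMinutes >= HARD_LIMIT_MINUTES) {
            closed = true;
        }
    }

    /** Explicit recovery by another client, modelled here as closing the file at once. */
    void recoverLease() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        // Case 1: nobody intervenes, so readers wait for the hard limit.
        LeaseRecoverySketch waiting = new LeaseRecoverySketch(3);
        waiting.elapseMinutes(59);
        System.out.println(waiting.visibleBlocks()); // 3: last block still hidden
        waiting.elapseMinutes(1);
        System.out.println(waiting.visibleBlocks()); // 4: file closed, nothing lost

        // Case 2: a client triggers recovery instead of waiting.
        LeaseRecoverySketch recovered = new LeaseRecoverySketch(3);
        recovered.recoverLease();
        System.out.println(recovered.visibleBlocks()); // 4
    }
}
```

In both cases the last block becomes readable once the file is closed; the only difference is how long readers wait, which matches the point that lowering the NN default is rarely needed.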