This I cannot explain. Check the blocks directory on the two servers. Maybe they were all under one datanode only.

St.Ack
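A minimal sketch of that check, assuming the Hadoop 0.20 on-disk layout in which a datanode keeps block files named `blk_<id>` (with `blk_<id>_<genstamp>.meta` checksum files alongside) under its `dfs.data.dir`. The function name is hypothetical, and the directory you pass in is whatever `dfs.data.dir` points to on your nodes:

```python
import os


def count_block_files(data_dir):
    """Count HDFS block files under data_dir, skipping .meta checksum files.

    Assumes block files are named blk_<id>; anything else (VERSION,
    lock files, *.meta) is ignored.
    """
    total = 0
    # Walk recursively because blocks may sit in nested subdirectories.
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.startswith("blk_") and not name.endswith(".meta"):
                total += 1
    return total
```

Run it against the data directory on each datanode and compare the two counts. If one node reports (nearly) all the blocks and the other close to none, the data was effectively living under a single datanode.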
2011/7/27 吴限 <[email protected]>:
> Thanks for your reply. But later I did another experiment, similar to
> the one I explained earlier.
> Step 1: I inserted some data into HBase.
> Step 2: I shut down one of the region servers.
> Step 3: I checked the table and found some data had been lost.
> Step 4: I disabled the table and then enabled it.
> Step 5: I checked again and found nothing lost.
>
> If some data didn't exist on the other region server, then how can you
> explain this?
>
> Hope to get your reply. Thanks~
>
> 2011/7/28 Chris Tarnas <[email protected]>
>
>> Replication of 1x means no replication. 2x would mean the data exists in
>> two locations (what it looks like you want). Running with a replication of
>> 1x is a very bad idea and is pretty much a guaranteed way to get data loss.
>>
>> -chris
>>
>> On Jul 27, 2011, at 8:58 AM, 吴限 wrote:
>>
>> > Hi everyone. I'd like to run the following data loss scenario by you
>> > to see if we are doing something obviously wrong with our setup here.
>> >
>> > Setup:
>> > - cdh3u0
>> > - Hadoop 0.20.2
>> > - HBase 0.90.1
>> > - 1 Master Node running as NameNode & JobTracker
>> > - zookeeper quorum
>> > - 2 child nodes running as Datanode, TaskTracker and RegionServer each
>> > - dfs.replication is set to 1
>> >
>> > First, I inserted some data into HBase a few hours ago.
>> > Then, after a while, I rebooted one of the region servers and waited until
>> > the master responded to that. However, when I checked the table using the
>> > hbase shell (I used the "count" command), I noticed that a huge amount
>> > of data had been lost.
>> > After I restarted the regionserver I had rebooted and checked again,
>> > I found that some of the missing data had come back, but some data
>> > was still missing.
>> > At last, after I disabled the table and then enabled it, I found that
>> > all data was stored in the cluster and no data was lost.
>> >
>> > This is problematic since we are supposed to
>> > replicate at x1, so at least one other node should be able to
>> > theoretically serve the data that the downed regionserver can't.
>> >
>> > Questions:
>> >
>> > - How can you guys explain this weird situation?
>> > - Is there a way to recover such lost data?
>> >
>> > Any tips here are definitely appreciated. I'll be happy to provide more
>> > information as well.
