Hi everyone. I'd like to run the following data loss scenario by you to see if we are doing something obviously wrong with our setup here.

Setup:
- cdh3u0
- Hadoop 0.20.2
- HBase 0.90.1
- 1 master node running as NameNode & JobTracker
- zookeeper quorum
- 2 child nodes, each running as DataNode, TaskTracker and RegionServer
- dfs.replication is set to 1

First, I inserted some data into HBase a few hours ago. After a while, I rebooted one of the region servers and waited until the master responded to that. However, when I checked the table using the hbase shell (I used the "count" command), I noticed that a huge amount of data was missing. After I restarted the regionserver that I had rebooted and checked again, I found that some of the missing data had come back, but some of it still could not be found. Finally, after I disabled the table and then enabled it again, I found that all of the data was stored in the cluster and nothing had been lost.

This is problematic since we are supposed to replicate at x1, so at least one other node should theoretically be able to serve the data that the downed regionserver can't.

Questions:
- How can you guys explain this weird situation?
- Is there a way to recover such lost data?

Any tips here are definitely appreciated. I'll be happy to provide more information as well.
