On Thu, Dec 22, 2011 at 1:34 PM, James Estes <[email protected]> wrote:
> We have a 6 node 0.90.3-cdh3u1 cluster. We have 8092 regions. I
> realize we have too many regions and too few nodes…we're addressing
> that.

Good.

> We currently have an issue where we seem to have lost region
> data. When data is requested for a couple of our regions, we get
> errors like the following on the client:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> Failed 1 action: IOException: 1 time, servers with issues:
> node13host:60020
> …
> java.io.IOException: java.io.IOException: Could not seek
> StoreFileScanner[HFileScanner for reader
> reader=hdfs://namenodehost:54310/hbase/article/4cbc7c9264820a7b30ddd5755d77ab07/data/6810866521278698568,
> compression=none, inMemory=false,
> firstKey=95ac7c7894f86d4455885294582370e30a68fdf1/data:acquireDate/1321151006961/Put,
> lastKey=95b47d337ff72da0670d0f3803443dd3634681ec/data:text/1323129675986/Put,
> avgKeyLen=65, avgValueLen=24, entries=6753283, length=667536405,
> cur=null]
> …
> Caused by: java.io.FileNotFoundException: File does not exist:
> /hbase/article/4cbc7c9264820a7b30ddd5755d77ab07/data/6810866521278698568
>

If you grep the namenode logs, can you find a history on it? Perhaps this is a double-assignment and the other assignment moved this file -- compacted it out of existence?

> The file referenced is indeed not in hdfs. Grepping further back in
> the logs reveals that the problem has been occurring for over a week
> (likely longer, but the logs have rolled off). There are a bunch of
> files in /hbase/article/4cbc7c9264820a7b30ddd5755d77ab07/data/ (270 of
> them), unsure why they aren't compacting, I looked further in the
> logs and find similar exceptions when trying to do a major compaction,
> ultimately failing b/c of:
> Caused by: java.io.FileNotFoundException: File does not exist:
> /hbase/article/4cbc7c9264820a7b30ddd5755d77ab07/data/6810866521278698568
>

It's probably not compacting because the above happens when it tries.

Try closing/moving the region; see shell for how.

St.Ack
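
For the namenode log search suggested above, a rough sketch in Python follows. It assumes nothing about the namenode log format: it only pulls out, in order, every line that mentions the missing store file name from the exception. The function name `file_history` and the log paths passed on the command line are hypothetical; plain grep does the same job.

```python
import re
import sys
from typing import Iterable, List

# Store file name taken from the FileNotFoundException quoted above.
STOREFILE = "6810866521278698568"


def file_history(log_lines: Iterable[str], needle: str) -> List[str]:
    """Return every log line that mentions `needle`, in original order.

    The digit lookarounds keep the name from matching inside a longer
    number (for example, another numeric file or block id).
    """
    pattern = re.compile(r"(?<!\d)" + re.escape(needle) + r"(?!\d)")
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]


if __name__ == "__main__":
    # Hypothetical usage: python file_history.py <namenode log files...>
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            for hit in file_history(fh, STOREFILE):
                print(f"{path}: {hit}")
```

Reading the hits oldest to newest should show when the file was created and what later removed or renamed it, which is the history needed to confirm or rule out a double-assignment.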
