On Mon, Jan 9, 2012 at 1:57 PM, James Estes <[email protected]> wrote:
> Should we file a ticket for this issue? FWIW we got this fixed (not
> sure if we actually lost any data though). We had to bounce the region
> server (non-gracefully). The region server seemed to have some stale
> file handles into hdfs...open inputstreams to files that were long
> deleted in hdfs. Any compactions or anything that would hit the
> region would fail b/c it wigged out on the stale handles. Even a
> graceful shutdown would get stuck on it. Shutting it down directly
> worked, because it comes back up and resets the handles (i guess?).
>
Yes. This is what it does. Files are opened on region open ONLY.

> So, should we file a ticket for this issue? I'm not sure how we got
> in this state, but perhaps there can be some way to recover in the
> code if it occurs? We actually tried to repro by deleting a file
> straight out of hdfs, but it didn't seem to trigger the issue (but we
> tried this in cdh3u2, but had the issue in cdh3u1).
>

Deleting a file should have done it -- if you then went and did a scan
against that file's content.

My guess is that it was a double-assignment. If you go back through the
master logs and track the history of the region, you may see it on two
servers concurrently at some time in the past.

St.Ack
