Jeff,

Take a look at the master logs to find where the WALs were sorted into the /accumulo/recovery/... directory. Then check whether those WALs are still around and contain content.
Where is this EOF exception, on a tserver? Is the master log complaining about anything?

Mike

On Mon, Oct 17, 2016 at 6:15 PM, Jeff Kubina <jeff.kub...@gmail.com> wrote:

> We had a lot of datanodes lock up nearly simultaneously in our Accumulo
> instance. Many more of the tservers also went offline. After about two
> hours we were able to get all the datanodes and tservers back online with
> no HDFS blocks lost. However, we have two tservers throwing about 70
> exceptions caused by:
>
> java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a
> SequenceFile.
>
> For all of the exceptions, the "..../accumulo/recovery/.../part-r-00000/index"
> files are empty, but their associated
> ..../accumulo/recovery/.../part-r-00000/data
> files are not.
>
> Any suggestions on how we can best recover from these exceptions?
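
One way to triage which recovery directories are affected is a small script along these lines. It is only a sketch, and it assumes each part-r-00000 directory has first been copied to local disk (for example with hdfs dfs -get). The function names looks_like_sequence_file and check_map_dir are made up for illustration and are not part of Accumulo or Hadoop. The only format fact it relies on is that a Hadoop SequenceFile begins with the bytes "SEQ" followed by a version byte. It does not validate anything past that header.

```python
import os

# A Hadoop SequenceFile starts with the three bytes "SEQ" and then a version byte.
SEQ_MAGIC = b"SEQ"


def looks_like_sequence_file(path):
    """Return True if the local file at `path` has a SequenceFile-style header.

    Hypothetical helper: it checks the 4-byte header only, not the records.
    """
    with open(path, "rb") as f:
        header = f.read(4)
    return len(header) == 4 and header[:3] == SEQ_MAGIC


def check_map_dir(local_dir):
    """Report the state of the `index` and `data` files in one local copy
    of a part-r-00000 directory (hypothetical helper)."""
    report = {}
    for name in ("index", "data"):
        path = os.path.join(local_dir, name)
        if not os.path.exists(path):
            report[name] = "missing"
        elif os.path.getsize(path) == 0:
            # Matches the symptom described above: a zero-length index file.
            report[name] = "empty"
        elif looks_like_sequence_file(path):
            report[name] = "ok-header"
        else:
            report[name] = "bad-header"
    return report
```

A directory showing an "empty" index next to an "ok-header" data file matches the situation described in the original message. That result only identifies the affected sorted copies. Whether the original WALs still exist has to be checked separately.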