Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-21 Thread Jeff Kubina
Mike, Yes, thanks for the help. We had to delete the recovered files generated from the WAL a few times but that worked. Then we restarted the two tablets with the TProtocolException exceptions to fix those errors. We saved off the log files for you. Jeff -- Jeff Kubina 410-988-4436 On Fri,

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-21 Thread Michael Wall
Andrew/Jeff, How's it going? Did you resolve your issue? Mike On Tue, Oct 18, 2016 at 10:42 AM, Andrew Hulbert wrote: > I think it is attempting to do migrations at the moment FYI > > On 10/18/2016 10:40 AM, Andrew Hulbert wrote: > > Yes, it looks similar. > > Esp these parts: > > 2015-11-19 2

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
I think it is attempting to do migrations at the moment FYI On 10/18/2016 10:40 AM, Andrew Hulbert wrote: Yes, it looks similar. Esp these parts: 2015-11-19 22:43:05,998 [impl.TabletServerBatchReaderIterator] DEBUG: org.apache.thrift.protocol.TProtocolException: Expected protocol id ff8

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Yes, it looks similar. Esp these parts: 2015-11-19 22:43:05,998 [impl.TabletServerBatchReaderIterator] DEBUG: org.apache.thrift.protocol.TProtocolException: Expected protocol id ff82 but got 19 java.io.IOException: org.apache.thrift.protocol.TProtocolException: Expected protocol id ff

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Josh Elser
Or, if it's more convenient, this is the issue I was thinking of: https://issues.apache.org/jira/browse/ACCUMULO-4065 Andrew Hulbert wrote: I'll try to dig up the full error from the tserver On 10/18/2016 10:30 AM, Josh Elser wrote: Do you have the full exception for the "Expected protocol i

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
I'll try to dig up the full error from the tserver On 10/18/2016 10:30 AM, Josh Elser wrote: Do you have the full exception for the "Expected protocol id.." error? That looks like it might be incorrect usage of Thrift on our part.. Andrew Hulbert wrote: Mike, So backing up and then later de

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Note that the error is more like this: Expected protocol id ff82 but got 35 (0!;38\\;82,:9997, ) On 10/18/2016 10:28 AM, Andrew Hulbert wrote: Mike, So backing up and then later deleting the recovery directories a few times did the trick. It seemed that removing the initial bad one

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Josh Elser
Do you have the full exception for the "Expected protocol id.." error? That looks like it might be incorrect usage of Thrift on our part.. Andrew Hulbert wrote: Mike, So backing up and then later deleting the recovery directories a few times did the trick. It seemed that removing the initial b

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Mike, So backing up and then later deleting the recovery directories a few times did the trick. It seemed that removing the initial bad one caused the others to go through for the most part... I believe all the WAL files were there. I'll look for the WAL deleted in the GC logs and see if the

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Michael Wall
Andrew, That is what I was going to suggest you try. Where is that "Unable to find recovery files for extent" log? Any way we can see some actual logs? Are all the WALs there? Do you find any of the WALs deleted by GC in the gc logs? Do you find any duplicate WALs in the HDFS trash? On Tue, O

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Andrew Hulbert
Mike, For one of the WALs I backed up the recovery directory and that initiated a new recovery attempt as indicated in the tserver debug log... Then the exception was thrown: Unable to find recovery files for extent xx logentry x hdfs://path/to/wal/ Any ideas? I figure we can z

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Jeff Kubina
On Tue, Oct 18, 2016 at 6:32 AM, Michael Wall wrote:
> Take a look at the master logs for where the WAL was sorted to the /accumulo/recovery/... directory. Then look to see if those WALs are still around and contain content.
Checked one of them, yes it is around with content. Where is

Re: java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-18 Thread Michael Wall
ever we have two tservers throwing about 70 > exceptions caused by: > > java.IO.EOFException: ..../accumulo/recovery/.../part-r-0/index not a > SequenceFile. > > For all the exceptions all the "/accumulo/recovery/.../part-r-0/index" > files are empty but

java.IO.EOFException: ..../accumulo/recovery/.../part-r-00000/index not a SequenceFile.

2016-10-17 Thread Jeff Kubina
caused by: java.IO.EOFException: /accumulo/recovery/.../part-r-0/index not a SequenceFile. For all the exceptions all the "/accumulo/recovery/.../part-r-0/index" files are empty but their associated /accumulo/recovery/.../part-r-0/data file is not. Any suggestions
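The symptom described above (an empty `index` file next to a non-empty `data` file) can be checked across the whole recovery directory rather than one path at a time. The following is a minimal sketch, not something posted in the thread: it assumes you have captured the output of `hdfs dfs -ls -R /accumulo/recovery` as text lines, and it relies on the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path). The function name and the sample paths are hypothetical.

```python
def find_empty_index_files(ls_lines):
    """Return paths of zero-length files named 'index' in `hdfs dfs -ls -R` output."""
    empty = []
    for line in ls_lines:
        # Expected columns: perms, replication, owner, group, size, date, time, path
        fields = line.split(None, 7)
        if len(fields) != 8 or fields[0].startswith("d"):
            continue  # skip directories and any line that is not a file entry
        size, path = fields[4], fields[7]
        if size == "0" and path.rsplit("/", 1)[-1] == "index":
            empty.append(path)
    return empty


# Hypothetical listing; the recovery directory names are placeholders.
sample = [
    "drwxr-xr-x   - accumulo hadoop          0 2016-10-17 09:00 /accumulo/recovery/EXAMPLE-A/part-r-00000",
    "-rw-r--r--   3 accumulo hadoop       4096 2016-10-17 09:00 /accumulo/recovery/EXAMPLE-A/part-r-00000/data",
    "-rw-r--r--   3 accumulo hadoop          0 2016-10-17 09:00 /accumulo/recovery/EXAMPLE-A/part-r-00000/index",
    "-rw-r--r--   3 accumulo hadoop        512 2016-10-17 09:00 /accumulo/recovery/EXAMPLE-B/part-r-00000/index",
]

for path in find_empty_index_files(sample):
    print(path)
```

This only lists candidates. Whether a given recovery directory is safe to back up and remove is a separate judgment, as the replies in this thread discuss.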