Mike,
Yes, thanks for the help. We had to delete the recovered files generated
from the WAL a few times but that worked. Then we restarted the two tablets
with the TProtocolException exceptions to fix those errors. We saved off
the log files for you.
Jeff
--
Jeff Kubina
410-988-4436
On Fri,
Andrew/Jeff,
How's it going? Did you resolve your issue?
Mike
On Tue, Oct 18, 2016 at 10:42 AM, Andrew Hulbert wrote:
I think it is attempting to do migrations at the moment FYI
On 10/18/2016 10:40 AM, Andrew Hulbert wrote:
Yes, it looks similar.
Esp these parts:
2015-11-19 22:43:05,998 [impl.TabletServerBatchReaderIterator] DEBUG:
org.apache.thrift.protocol.TProtocolException: Expected protocol id ff82
but got 19
java.io.IOException: org.apache.thrift.protocol.TProtocolException: Expected
protocol id ff
Or, if it's more convenient, this is the issue I was thinking of:
https://issues.apache.org/jira/browse/ACCUMULO-4065
Andrew Hulbert wrote:
I'll try to dig up the full error from the tserver
On 10/18/2016 10:30 AM, Josh Elser wrote:
Do you have the full exception for the "Expected protocol id.." error?
That looks like it might be incorrect usage of Thrift on our part..
Andrew Hulbert wrote:
Note that the error is more like this:
Expected protocol id ff82 but got 35 (0!;38\\;82,:9997,
)
On 10/18/2016 10:28 AM, Andrew Hulbert wrote:
Mike,
So backing up and then later deleting the recovery directories a few
times did the trick. It seemed that removing the initial bad one caused
the others to go through for the most part...
I believe all the WAL files were there. I'll look for the WAL deleted in
the GC logs and see if the
Andrew,
That is what I was going to suggest you try. Where is that "Unable to find
recovery files for extent" log? Any way we can see some actual logs?
Are all the WALs there? Do you find any of the WALs deleted by GC in the gc
logs? Do you find any duplicate WALs in the HDFS trash?
On Tue, O
Mike,
For one of the WALs I backed up the recovery directory and that
initiated a new recovery attempt as indicated in the tserver debug log...
Then the exception was thrown:
Unable to find recovery files for extent xx logentry x
hdfs://path/to/wal/
Any ideas? I figure we can z
On Tue, Oct 18, 2016 at 6:32 AM, Michael Wall wrote:
> Take a look at the master logs for where the WAL was sorted to the
> /accumulo/recovery/...
> directory. Then look to see if those WALs are still around and contain
> content.
>
Checked one of them, yes it is around with content.
Where is
... we have two tservers throwing about 70 exceptions
caused by:
java.io.EOFException: /accumulo/recovery/.../part-r-0/index not a
SequenceFile.
For all the exceptions all the
"/accumulo/recovery/.../part-r-0/index" files are empty but their
associated /accumulo/recovery/.../part-r-0/data file is not.
Any suggestions
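
For anyone hitting the same symptom, the condition described above (an empty
"index" file next to a non-empty "data" file) could be spotted with something
like the sketch below. It is only a minimal illustration, not part of Accumulo:
it assumes the listing comes from "hdfs dfs -ls -R /accumulo/recovery" in the
usual eight-column layout (permissions, replication, owner, group, size, date,
time, path), and the function name find_empty_index_dirs is made up for this
example.

```python
def find_empty_index_dirs(ls_lines):
    """Return directories whose 'index' file is empty but whose 'data' file is not.

    ls_lines: iterable of text lines, assumed to be `hdfs dfs -ls -R` output.
    """
    sizes = {}
    for line in ls_lines:
        fields = line.strip().split(None, 7)
        # Regular files have 8 columns and permissions starting with '-';
        # directories ('d...') and header lines such as "Found N items" are skipped.
        if len(fields) != 8 or not fields[0].startswith("-"):
            continue
        try:
            size = int(fields[4])
        except ValueError:
            continue
        parent, _, name = fields[7].rpartition("/")
        if name in ("index", "data"):
            sizes.setdefault(parent, {})[name] = size
    # Report only directories where index is 0 bytes and data has content.
    return sorted(
        parent
        for parent, found in sizes.items()
        if found.get("index") == 0 and found.get("data", 0) > 0
    )
```

Feeding it the saved output of the listing command returns the affected
part-r-0 directories, which are the candidates to back up before triggering a
new recovery attempt.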