Re: recovering from failed repair, cassandra 3.10

2017-05-31 Thread Micha
The error that keeps the node from starting is below.
Files such as
"mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log"
exist on both disks of a node, but their contents differ.

Of course, simply renaming or deleting the two files (or making them
identical) lets Cassandra start again. But I would like to know the right
way to handle this.
I will start the repair again with an increased log level.


thanks for answering,
 Michael

ERROR 08:26:20 Mismatched line in file
mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log:
got
'ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4348-big,0,8][1394849421]'
expected
'ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]',
giving up
ERROR 08:26:20 Failed to read records for transaction log
[mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
in
/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0,
/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0]
ERROR 08:26:20 Unexpected disk state: failed to read transaction log
[mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
in
/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0,
/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0]
Files and contents follow:
/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4241-big,1495845618000,8][2443235315]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4249-big,1495856254000,8][681858089]
COMMIT:[,0,0][2613697770]
/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc_txn_anticompactionafterrepair_19a46410-459f-11e7-91c7-4f4e8666b5c8.log
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4348-big,0,8][1394849421]
***Does not match in first replica file
ADD:[/data/2/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4349-big,0,8][910462411]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4241-big,1495845618000,8][2443235315]
REMOVE:[/data/1/cassandra/data/KEYSPACE/TABLE-8e40c6b0f4fa11e6a7912b3358087dc0/mc-4249-big,1495856254000,8][681858089]
COMMIT:[,0,0][2613697770]
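
For anyone who wants to compare such replica files by hand, here is a
minimal Python sketch. It is not part of Cassandra. The record layout
"TYPE:[path,number,number][checksum]" is only inferred from the lines
quoted in this message, and the field names update_time and num_files, as
well as the helper names parse_record, records and first_mismatch, are my
own assumptions. The sketch does not verify the checksum.

```python
import re
from typing import List, Optional, Tuple

# Assumed layout, inferred from the records quoted in this thread:
#   TYPE:[path,update_time,num_files][checksum]
RECORD_RE = re.compile(r"^([A-Z]+):\[(.*),(\d+),(\d+)\]\[(\d+)\]$")


def parse_record(line: str) -> Tuple[str, str, int, int, int]:
    """Split one record line into (type, path, update_time, num_files, checksum)."""
    m = RECORD_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised record: {line!r}")
    kind, path, update_time, num_files, checksum = m.groups()
    return kind, path, int(update_time), int(num_files), int(checksum)


def records(text: str) -> List[str]:
    """Return the non-blank lines of a replica file, stripped of whitespace."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def first_mismatch(
    a: List[str], b: List[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, record_a, record_b) for the first differing position.

    A missing record (one replica is shorter) is reported as None.
    Returns None when both replicas hold the same records in the same order.
    """
    for i in range(max(len(a), len(b))):
        ra = a[i] if i < len(a) else None
        rb = b[i] if i < len(b) else None
        if ra != rb:
            return i, ra, rb
    return None
```

Given the contents of the two files listed above, first_mismatch would
report position 0: the /data/1 copy starts with the ADD for mc-4349-big,
while the /data/2 copy starts with an extra ADD for mc-4348-big.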






On 31.05.2017 11:10, Oleksandr Shulgin wrote:
> On Wed, May 31, 2017 at 9:11 AM, Micha wrote:
>
> > Hi,
> >
> > after failed repair on a three node cluster all nodes were down.
>
> To clarify, was it failed repair that brought the nodes down so that you
> had to start them back?  Do you see any error messages or stack trace in
> the logs?
>
> > It cannot start, since it finds a mismatch in a
> > mc_txn_anticompactionafterrepair log file:
> > "got ADD "
> > "expected "ADD:..."
> >
> > The two log files are different:
> > one has "ADD, ADD; REMOVE, REMOVE, COMMIT"
> > the other is missing an "ADD"
>
> I assume this is about commit log.  There doesn't seem to be a separate
> log file named "mc_txn_anticompactionafterrepair" in your Cassandra version.
>
> > Each of the nodes give this error.
> >
> > sstableutil -c  also gives this error.
> >
> > How to deal with this?
>
> I would try removing the faulty commit log file(s) and try to start the
> node again, until it works.  This might mean that you'll have to remove
> all commit logs, but it's better than being completely down, I assume.
>
> -- 
> Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
> 127-59-707

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: recovering from failed repair, cassandra 3.10

2017-05-31 Thread Oleksandr Shulgin
On Wed, May 31, 2017 at 9:11 AM, Micha wrote:

> Hi,
>
> after failed repair on a three node cluster all nodes were down.
>

To clarify, was it the failed repair that brought the nodes down, so that
you had to start them again?  Do you see any error messages or stack traces
in the logs?


> It cannot start, since it finds a mismatch in a
> mc_txn_anticompactionafterrepair log file:
> "got ADD "
> "expected "ADD:..."
>
>
> The two log files are different:
> one has "ADD, ADD; REMOVE, REMOVE, COMMIT"
> the other is missing an "ADD"
>

I assume this is about the commit log.  There doesn't seem to be a separate
log file named "mc_txn_anticompactionafterrepair" in your Cassandra version.

> Each of the nodes give this error.
>
> sstableutil -c  also gives this error.
>
> How to deal with this?
>

I would try removing the faulty commit log file(s) and try to start the
node again, until it works.  This might mean that you'll have to remove all
commit logs, but it's better than being completely down, I assume.

-- 
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707


recovering from failed repair, cassandra 3.10

2017-05-31 Thread Micha
Hi,

After a failed repair on a three-node cluster, all nodes were down.
They cannot start, because each node finds a mismatch in a
mc_txn_anticompactionafterrepair log file:
"got ADD "
"expected "ADD:..."


The two log files are different:
one has "ADD, ADD, REMOVE, REMOVE, COMMIT"
the other is missing an "ADD"
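
A read-only way to find such differing copies could look like the Python
sketch below. It is not an official recovery procedure: it only reads
files and changes nothing. The function name differing_txn_logs and the
glob pattern "*_txn_*.log" are assumptions based on the file name above,
and the directories passed in are whatever table directories the node
keeps on each disk.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def differing_txn_logs(table_dirs: Iterable[str]) -> Dict[str, List[Path]]:
    """Group transaction log files by name across the given directories.

    Returns only those names that exist in more than one directory and
    whose copies do not all have identical contents.
    """
    by_name: Dict[str, List[Path]] = defaultdict(list)
    for d in table_dirs:
        # Assumed naming pattern, taken from the file name in this thread.
        for p in sorted(Path(d).glob("*_txn_*.log")):
            by_name[p.name].append(p)

    result: Dict[str, List[Path]] = {}
    for name, paths in by_name.items():
        contents = {p.read_bytes() for p in paths}
        if len(paths) > 1 and len(contents) > 1:
            result[name] = paths
    return result
```

Calling it with the table directory from each data disk returns a mapping
from log file name to the paths of its copies, so the copies can then be
inspected side by side before deciding what to do with them.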


Each of the nodes gives this error.

sstableutil -c also gives this error.

How to deal with this?


thanks,
 Michael



