Phil Stracchino wrote on 27.05.23 at 20:22:
On 5/27/23 08:17, Dr. Thorsten Brandau wrote:

CHECK TABLE File ;

+-------------+-------+----------+---------------------------------------------------+
| Table       | Op    | Msg_type | Msg_text                                          |
+-------------+-------+----------+---------------------------------------------------+
| bacula.File | check | Warning  | InnoDB: The B-tree of index PRIMARY is corrupted. |
| bacula.File | check | error    | Corrupt                                           |
+-------------+-------+----------+---------------------------------------------------+

And I cannot recreate the indices as before; it fails with the same error message: ERROR 1030 (HY000): Got error 106 "Transport endpoint is already connected" from storage engine InnoDB
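
For reference, a rebuild attempt of that kind is usually one of the following statements; this is a guess at what was run, not a verbatim record:

ALTER TABLE File ENGINE=InnoDB;   -- "null" ALTER: rebuilds the table and all of its indexes
OPTIMIZE TABLE File;              -- equivalent rebuild for an InnoDB table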

When I check the indices:

show indexes from File;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| File  |          0 | PRIMARY  |            1 | FileId      | A         |    46738695 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
1 row in set (0,001 sec)


Well, I tried a bit more and found some hints to remove the AUTO_INCREMENT from the FileId field and re-add it. I could remove it, but not add it again (ERROR 1034 (HY000): Index for table 'File' is corrupt; try to repair it).
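
The round trip described here would normally be a pair of ALTER TABLE statements along these lines (a sketch of the likely commands, not the exact ones run):

ALTER TABLE File MODIFY FileId BIGINT(20) UNSIGNED NOT NULL;
    -- dropping AUTO_INCREMENT: this part worked
ALTER TABLE File MODIFY FileId BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT;
    -- re-adding it: this fails with ERROR 1034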

So I am pretty much on the verge of having the database killed.


OK Doc, there's one other thing we can try here, and that is a manual table rebuild.  You've already got the SHOW CREATE TABLE output from the old table, which verified that it was properly created as an InnoDB table, and you've got lots of available storage.

So first create a new empty table:

CREATE TABLE `File_NEW` (
  `FileId` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `FileIndex` int(11) DEFAULT 0,
  `JobId` int(10) unsigned NOT NULL,
  `PathId` int(10) unsigned NOT NULL,
  `Filename` blob NOT NULL,
  `DeltaSeq` smallint(5) unsigned DEFAULT 0,
  `MarkId` int(10) unsigned DEFAULT 0,
  `LStat` tinyblob NOT NULL,
  `MD5` tinyblob DEFAULT NULL,
  PRIMARY KEY (`FileId`)
) ENGINE=InnoDB;


Then we need to copy the rows from the damaged table into it. With the indexes damaged this will be SLOW, but should still be doable.  You have about 70 million rows, vs. my 40 million.


Here's how I would do it:


DROP PROCEDURE IF EXISTS moveFileRecords;

DELIMITER //

-- Copy rows from the damaged File table into File_NEW in batches of _batchSize,
-- starting _startOffset rows into the table (pass 0 to start from the beginning).
CREATE PROCEDURE moveFileRecords(IN _startOffset INT UNSIGNED, IN _batchSize INT UNSIGNED)
BEGIN
    DECLARE _copied, _moved, _offset INT UNSIGNED DEFAULT 0;

    IF (_startOffset IS NOT NULL) THEN
        SET _offset = _startOffset;
    END IF;

    REPEAT
        -- Copy the next batch; IGNORE skips any rows whose FileId is already in File_NEW.
        INSERT IGNORE INTO File_NEW
        SELECT * FROM File
        ORDER BY FileId
        LIMIT _batchSize
        OFFSET _offset;

        -- Track and report progress after each batch.
        SET _copied = ROW_COUNT();
        SET _moved = _moved + _copied;
        SET _offset = _offset + _batchSize;

        SELECT _moved AS 'Total Rows Imported',
            _offset AS 'Current Offset',
            _copied AS 'This batch';

        -- Stop once a batch comes back short, i.e. we have run out of rows to copy.
        UNTIL _copied < _batchSize
    END REPEAT;
END //

DELIMITER ;



MySQL 127.0.0.1> CALL moveFileRecords(0,10000);
+---------------------+----------------+------------+
| Total Rows Imported | Current Offset | This batch |
+---------------------+----------------+------------+
|               10000 |          10000 |      10000 |
+---------------------+----------------+------------+
1 row in set (0.562 sec)

...

+---------------------+----------------+------------+
| Total Rows Imported | Current Offset | This batch |
+---------------------+----------------+------------+
|              370000 |         370000 |      10000 |
+---------------------+----------------+------------+
1 row in set (43.798 sec)

...




I'd probably start with a batch size of 10k to verify it's working, then go up to 100k.  You can interrupt it at any time and skip already-copied rows by passing the last 'Current Offset' as _startOffset.
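
For example, to pick up again after the sample run above was stopped at offset 370000, and switch to the larger batch size, the call would look something like:

CALL moveFileRecords(370000, 100000);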


Now, DISCLAIMERS:
This is going to be slow, because the PRIMARY key is borked.  And it's going to slow down as we progress through the table, because each pass will have to skip over more and more records.  It may be VERY slow. (Also, your performance WILL differ from mine, because we have very different hardware and database configurations; I'm doing this on a 3-node synchronous cluster.)


There are three reasons for doing it this way:
1.  You can see that it's visibly making progress, not just going out to lunch.
2.  Copying the rows in small batches reduces the memory required for each copy operation.  Copying 70 million rows at once is a huge transaction, and if it fails for some reason in the middle, all that work is lost.
3.  If the copy is going to fail for any reason, we want to copy as many rows as possible BEFORE it fails *and know where it failed*.

If it fails partway through, say at 35000000, you MAY be able to pick up again by passing it a starting offset of 35100000 and seeing whether it can continue past the point of failure.

I'm also not deleting the copied rows from the source table, for two reasons:
1.  Bulk deletes in MySQL are slow, and it would cost us more time than it would save.
2.  We want to preserve all of the original data until we're sure we've got it all.
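
Once you're satisfied that File_NEW holds everything recoverable, the usual finish (not spelled out here, so treat it as a sketch) is to swap the tables and keep the damaged original around until the new one has been verified:

RENAME TABLE File TO File_OLD, File_NEW TO File;
-- and only much later, after verification:
-- DROP TABLE File_OLD;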


If you prefer, you can TRY doing a single one-shot batch copy first:

INSERT INTO File_NEW SELECT * FROM File;


Well, thank you. I tried everything above, including going from 9999-10000 instead (tried different ranges).

WHATEVER I DO, I only get

ERROR 1034 (HY000): Index for table 'File' is corrupt; try to repair it

back.


But this may run for hours with no visible output.  You can monitor it from another MySQL shell using SHOW PROCESSLIST to verify that it is still running.
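
If you want something more targeted than the full SHOW PROCESSLIST output, the information_schema view of the same data can be filtered; a small sketch, assuming the one-shot copy above is what's running:

SELECT ID, TIME, STATE, INFO
  FROM information_schema.PROCESSLIST
 WHERE INFO LIKE 'INSERT INTO File_NEW%';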

No worries about the performance; it takes about 1 ms to get the error message :-(

So this did not work either. I seem to be stuck on a pretty basic problem, and no Google search I tried even gives me a way to understand what the problem is or how to solve it.

So I decided not to waste any more time on the problem, dropped the database, re-created the tables with the Bacula tool, and re-initialized the changer. Now a full new backup is running, and I hope that this was a once-in-a-lifetime problem.

Have a nice time everybody and thanks to everyone trying to help

Ciao

T