Were you looking in syslog, and also telnetting to a tracker and running !watch? The latter is the most useful, I think. An error or two to go on would give me a better chance of figuring it out.

I'm not up to speed yet on the FSCK job (I'll have to be next week, though), so we'll have to hope one of the more in-tune folks pipes up here.

-Dormando

Del Raco wrote:
13 devices (10 readonly, 3 alive). The 10 are readonly
because they're almost full; I'm waiting on some new
drives to arrive so the devices can be rebalanced.

4 trackers (1 fsck worker on each one).

I didn't see any specific errors, but I'm not sure I was
looking in all the right places.  I manually deleted
fids 1 and 2 from the file table and did an fsck reset
after stopping each time.  The output below is from
running fsck for the third time.  I waited around 30
minutes after seeing the bad fid 678419 before
stopping fsck.
How long should I wait for SRCH to become GONE?  Also,
how do I go about finding how these fids became
orphaned?  Any way to prevent this in the future?

Thanks.

========================================

Output from "mogadm fsck status" (it's not currently
running)

    Running: No
     Status: 669593 / 41601129 (1.61%)
       Time: 37m (297 fids/s; 2294m remain)
 Check Type: Normal (check policy + files)

 [num_NOPA]: 331
 [num_SRCH]: 331

========================================

mysql> select fid, evcode, count(*) from fsck_log
group by fid, evcode;
+--------+--------+----------+
| fid    | evcode | count(*) |
+--------+--------+----------+
|      1 | NOPA   |      119 |
|      1 | SRCH   |      119 |
|      2 | NOPA   |      147 |
|      2 | SRCH   |      147 |
| 678419 | NOPA   |      331 |
| 678419 | SRCH   |      331 |
+--------+--------+----------+
6 rows in set (0.00 sec)

========================================

 mogadm fsck taillog
unixtime             event           fid      devid
1189153502            NOPA        678419          -
1189153502            SRCH        678419          -
1189153508            NOPA        678419          -
1189153508            SRCH        678419          -
1189153513            NOPA        678419          -
1189153513            SRCH        678419          -
1189153519            NOPA        678419          -
1189153519            SRCH        678419          -
1189153524            NOPA        678419          -
1189153524            SRCH        678419          -
1189153530            NOPA        678419          -
1189153530            SRCH        678419          -
1189153535            NOPA        678419          -
1189153535            SRCH        678419          -
1189153541            NOPA        678419          -
1189153541            SRCH        678419          -
1189153546            NOPA        678419          -
1189153546            SRCH        678419          -
1189153551            NOPA        678419          -
1189153551            SRCH        678419          -
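
The taillog rows above can be turned into per-fid retry gaps with a few lines of Python. This is only a sketch: the function name `retry_gaps` is made up for illustration, the sample text is the first few rows of the output above, and it assumes each distinct timestamp for a fid marks one fsck attempt on that fid.

```python
def retry_gaps(taillog_text):
    """Return {fid: [seconds between successive distinct timestamps]}."""
    times = {}
    for line in taillog_text.splitlines():
        parts = line.split()
        # Skip the header row and anything that is not a 4-column data row.
        if len(parts) != 4 or not parts[0].isdigit():
            continue
        utime, _event, fid, _devid = parts
        seen = times.setdefault(int(fid), [])
        t = int(utime)
        # NOPA and SRCH share a timestamp; count each timestamp once.
        if not seen or seen[-1] != t:
            seen.append(t)
    return {fid: [b - a for a, b in zip(ts, ts[1:])]
            for fid, ts in times.items()}


SAMPLE = """\
unixtime             event           fid      devid
1189153502            NOPA        678419          -
1189153502            SRCH        678419          -
1189153508            NOPA        678419          -
1189153508            SRCH        678419          -
1189153513            NOPA        678419          -
1189153513            SRCH        678419          -
"""

print(retry_gaps(SAMPLE))
```

On the full output above, the gaps alternate between 5 and 6 seconds, which suggests the same fid is being re-checked in a steady loop rather than fsck moving on.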

========================================

mysql> select min(utime), max(utime) from fsck_log
where fid = 678419;
+------------+------------+
| min(utime) | max(utime) |
+------------+------------+
| 1189151747 | 1189153551 |
+------------+------------+
1 row in set (0.00 sec)
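
As a sanity check on the numbers above, the min/max utime values and the event count for fid 678419 can be combined to see how long fsck sat on that fid and how often it retried. This is a rough sketch; the helper name `stuck_summary` is hypothetical, and it assumes the 331 NOPA rows correspond to 331 separate attempts.

```python
def stuck_summary(first_utime, last_utime, attempts):
    """Return (span in seconds, average seconds between attempts)."""
    span = last_utime - first_utime
    # N attempts have N - 1 gaps between them.
    avg_gap = span / (attempts - 1) if attempts > 1 else 0.0
    return span, avg_gap


span, avg_gap = stuck_summary(1189151747, 1189153551, 331)
print(f"stuck for {span} s (~{span / 60:.0f} min), "
      f"one attempt every ~{avg_gap:.1f} s")
```

The span works out to 1804 seconds, which matches the roughly 30 minutes of waiting described earlier, with about one attempt every 5.5 seconds.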

