I have run into a strange problem. After the initial sync, fs_mirror enters the realtime replication state. I deployed them on 3 machines, and they had been running for nearly half a month. Then one day, on one host, for reasons I don't understand, I found that fs_mirror was getting empty records from its mirrord agent. Under normal conditions, the records look like this:

    "CREATE:/var/www/html"
    "FWRITE:/var/www/html/index.php"
    "DELETE:/var/www/html/temp"
    "MOVE:('/var/www/html/aa', '/var/www/html/bb')"
    ...

There should never be an empty record. The empty records send fs_mirror into an infinite loop.
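To make the expected record format concrete, here is a minimal sketch of how such records could be parsed, and how an empty one could be rejected with an error instead of being retried forever. The names parse_record and EmptyRecordError are hypothetical and are not the actual fs_mirror code. The sketch also assumes that a MOVE payload is always the repr of a (source, destination) tuple, as in the example above.

```python
import ast


class EmptyRecordError(ValueError):
    """Raised when the agent hands back an empty record (hypothetical name)."""


def parse_record(record):
    """Split an 'ACTION:payload' record into (action, payload)."""
    if not record or not record.strip():
        # Fail loudly here rather than asking for the same record again.
        raise EmptyRecordError("empty record received from mirrord agent")
    # Split on the first colon only, so colons inside paths are preserved.
    action, sep, payload = record.partition(":")
    if not sep:
        raise ValueError("malformed record: %r" % (record,))
    if action == "MOVE":
        # Assumption: the MOVE payload is the repr of a (src, dst) tuple.
        src, dst = ast.literal_eval(payload)
        return action, (src, dst)
    return action, payload
```

With a guard like this, the replication loop could log the offending serial number and stop, which would at least turn the infinite loop into a visible failure.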
I restarted fs_mirror from the point where it broke, but the problem remained. After debugging, I found that the problem occurs at the same serial number every time. (I use Berkeley DB for the log records (wmLog), and the serial numbers are the keys.) So I suspect the problem is in BDB, but I don't know how to test this or how to locate the fault. I tried opening the original db file in Python:

    >>> import bsddb
    >>> x = bsddb.btopen("/var/mirrord/wmlog")
    >>> len(x)
    623748
    >>> x["6854"]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
    KeyError: '6854'
    >>> x[str(6854)]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
      File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
        return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
    KeyError: '6854'
    >>> x.first()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/bsddb/__init__.py", line 278, in first
        rv = _DeadlockWrap(self.dbc.first)
      File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
        return function(*_args, **_kwargs)
    _bsddb.DBNotFoundError: (-30990, 'DB_NOTFOUND: No matching key/data pair found')

Even after I stopped the mirrord daemon, the errors remained. Then I copied the database file elsewhere and opened the copy:

    >>> import bsddb
    >>> x = bsddb.btopen("/tmp/wmlog")
    >>> len(x)
    0

The length is 0, and item lookups fail with the same errors as above.
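One way to narrow this down might be to walk the whole serial range and note which keys are missing and which have empty values, instead of probing single keys by hand. The following is only a sketch under some assumptions: that the keys are decimal strings, as in the session above, and that the serial numbers are meant to be contiguous. The function name scan_log is hypothetical. It only uses item lookup, so it should work on any mapping-like object, including a plain dict standing in for the log.

```python
def scan_log(log, first, last):
    """Return (missing, empty) serial numbers in the range [first, last].

    `log` is any mapping keyed by the decimal string of the serial number.
    """
    missing = []
    empty = []
    for serial in range(first, last + 1):
        key = str(serial)
        try:
            value = log[key]
        except KeyError:
            # Same failure as the x["6854"] lookup: no such key.
            missing.append(serial)
            continue
        if not value:
            # The key exists, but the record stored under it is empty.
            empty.append(serial)
    return missing, empty
```

For example, with a dict standing in for the log:

```python
sample = {"1": "CREATE:/var/www/html", "2": "", "4": "DELETE:/var/www/html/temp"}
missing, empty = scan_log(sample, 1, 4)
```

Here serial 3 would be reported as missing and serial 2 as empty. In my case x.first() also fails, so a scan like this would probably report every key as missing, which would point at the file itself rather than at a single record.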
Why does this happen when I copy the db file? And in particular, why is len() 0?! The only thing I could do was restart mirrord so that it rebuilt the BDB data file, and so far the problem has not occurred again. I don't know why an intermittent problem like this happens. Could anyone familiar with BDB give me some advice? Thanks.

-- 
------------------------------------------
My Projects:
http://sourceforge.net/projects/crablfs
http://crablfs.sourceforge.net/
http://crablfs.sourceforge.net/#ru_data_man
http://crablfs.sourceforge.net/tree.html
http://cralbfs.sourceforge.net/sysadm_zh_CN.html

My Blog:
http://chowroc.blogspot.com/
http://hi.baidu.com/chowroc_z/

Looking for a space and platform to exert my originalities (for my projects)...