empiredan opened a new issue, #1207:
URL: https://github.com/apache/incubator-pegasus/issues/1207

   Both [Test Release 
(dsn.replica.test)](https://github.com/apache/incubator-pegasus/actions/runs/3326102384/jobs/5505382914#logs)
 and [Test ASAN 
(dsn.replica.test)](https://github.com/apache/incubator-pegasus/actions/runs/3326102384/jobs/5505387446#logs)
 for dsn.replica.test have failed in CI workflows of 
https://github.com/apache/incubator-pegasus/pull/1205, with reported errors as 
follows:
   
   ```
   [ RUN      ] mutation_log_test.reset_from
   E2022-10-26 11:56:44.166 (1666785404166907303 282) 
replica.default0.0000010100010001: replica.cpp:569:init_disk_tag(): [1.1@] get 
disk tag of ./test-log failed: ERR_OBJECT_NOT_FOUND, init it to empty 
   W2022-10-26 11:56:44.172 (1666785404172717263 282) 
replica.default0.0000010100010001: filesystem.cpp:381:rename_path(): rename 
from './test-log.1666785404172467160' to './test-log' failed, err = Directory 
not empty
   F2022-10-26 11:56:44.172 (1666785404172723363 282) 
replica.default0.0000010100010001: mutation_log.cpp:970:operator()(): assertion 
expression: false
   F2022-10-26 11:56:44.172 (1666785404172728763 282) 
replica.default0.0000010100010001: mutation_log.cpp:970:operator()(): rollback 
./test-log.1666785404172[467](https://github.com/apache/incubator-pegasus/actions/runs/3326102384/jobs/5505382914#step:7:468)160
 to ./test-log failed
   Aborted (core dumped)
   Error: Process completed with exit code 134.
   ```
   
   Review the code of `mutation_log::reset_from` with the reported errors, some 
problems could be found.
   
   Firstly, the reason for core dump is that `err` in capture list of the 
lambda function for `dsn::defer` is passed as value rather than a reference. As 
a result, `err` for lambda function will always be `ERR_FILE_OPERATION_FAILED`; 
also, `utils::filesystem::rename_path(temp_dir, _dir)` will be called 
mistakenly. See the following code for details:
   ```c++
       // define `defer` for rollback temp_dir when failed or remove temp_dir 
when success
       auto temp_dir_resolve = dsn::defer([this, err, temp_dir]() {
           if (err != ERR_OK) {
               if (!utils::filesystem::rename_path(temp_dir, _dir)) {
                   // rollback failed means old log files are not be recovered, 
it may be lost if only
                   // LOG_ERROR,  dassert for manual resolve it
                   dassert_f("rollback {} to {} failed", temp_dir, _dir);
               }
           } else {
               if (!dsn::utils::filesystem::remove_path(temp_dir)) {
                   // temp dir allow delete failed, it's only garbage
                   LOG_ERROR_F("remove temp dir {} failed", temp_dir);
               }
           }
       });
   ```  
   
   Secondly, from the above code, another error can be found for 
`dassert_f("rollback {} to {} failed", temp_dir, _dir);`. The first parameter 
for `dassert_f` should have been a condition; however, "`rollback {} to {} 
failed`" is passed as the first parameter by mistake rather than the condition. 
Therefore, this assertion will never failed since this string will always be 
converted implicitly as true.
   
   Thirdly, the following code also renames a directory, however there is not 
any rollback process for this: 
   ```c++
       // move source dir to target dir
       if (!utils::filesystem::rename_path(dir, _dir)) {
           LOG_ERROR_F("rename {} to {} failed", dir, _dir);
           return err;
       }
       LOG_INFO_F("move {} to {} as our new log directory", dir, _dir);
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to