hemantk-12 commented on PR #5035:
URL: https://github.com/apache/ozone/pull/5035#issuecomment-1629587027

   > > Hey @swamirishi: I'm trying to understand what the actual problem is 
from the stack trace. This is the error I see:
   > > 
   > > * IO error: No such file or directory: While open a file for random 
read: 
/Users/sbalachandran/Documents/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-2ff22622-5883-4e2b-8090-8276162d8e5a/omNode-3/db.snapshots/checkpointState/om.db-aa3c01d4-a80a-47b1-8cde-979b62f0390d/000076.ldb:
 No such file or directory in file 
/Users/sbalachandran/Documents/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-2ff22622-5883-4e2b-8090-8276162d8e5a/omNode-3/db.snapshots/checkpointState/om.db-aa3c01d4-a80a-47b1-8cde-979b62f0390d/MANIFEST-000005
 *
   > > 
   > > I'm not sure what that means. Is it saying that the manifest file is 
missing? or that the ldb file is missing? In either case, the sst filtering 
service doesn't delete either of those types of files, does it? So why do we 
think the sst filtering service needs to be changed?
   > SST Filtering service deletes sst files through the rocksdb api. The 
incremental snapshot figures out the exclude list by walking through the entire 
OM metadata & getting the list of sst files. Manifest files are always taken 
from the leader; it is only the sst files which are incremental. While the copy 
is happening, some of the sst files might actually be deleted by the sst 
filtering service. Since the manifest file is coming from the leader, the db 
might get corrupted because the list of sst files in the manifest file & the 
actual filesystem don't match. For this we need to stop the sst filtering 
service before we start the copy. @hemantk-12 BTW the keyManager service is 
shut down before installing the leader snapshot on the follower, so all we are 
doing here is shutting down the sst filtering service before the download to 
ensure we don't corrupt the db.
   > @GeorgeJahad the ldb file is actually the same sst file that got deleted 
by the sst filtering service.
   
   @swamirishi Can you please update the PR description with what you 
explained here? In the current description, it is not clear what race condition 
you are trying to fix, and that is creating confusion.
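
For the description, the race explained above could be illustrated with a minimal sketch like the one below. This is not Ozone code: `PausableFilterSketch`, `tryDelete`, and `installWithServiceStopped` are hypothetical names, and an in-memory set stands in for the sst files on disk. It only shows the ordering being proposed, assuming the service is stopped before the exclude list is computed and restarted after the copy finishes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a background service that deletes sst files. */
final class PausableFilterSketch {
    // In-memory stand-in for the sst files present on the follower.
    private final Set<String> filesOnDisk;
    private boolean running = true;

    PausableFilterSketch(Collection<String> files) {
        this.filesOnDisk = new TreeSet<>(files);
    }

    /** One filtering pass: removes the file only while the service is running. */
    synchronized boolean tryDelete(String file) {
        return running && filesOnDisk.remove(file);
    }

    synchronized void stop() {
        running = false;
    }

    synchronized void start() {
        running = true;
    }

    /** Files currently present, i.e. what the exclude list is built from. */
    synchronized List<String> listFiles() {
        return new ArrayList<>(filesOnDisk);
    }

    /**
     * Stops the service, computes the exclude list, runs the copy, and
     * restarts the service afterwards, even if the copy throws.
     */
    List<String> installWithServiceStopped(Runnable copy) {
        stop();
        try {
            List<String> exclude = listFiles();
            copy.run(); // delete attempts made during the copy are refused
            return exclude;
        } finally {
            start();
        }
    }
}
```

Because no file can disappear between building the exclude list and finishing the copy, the set of files the manifest expects and the set on disk cannot drift apart during the download in this model.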


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

