On 2015/4/29 10:37, Gang He wrote:
> Hi Joseph,
>
> Thanks for your detailed description.
> See my question inline.
>
>> Hi Goldwyn,
>>
>> Thanks for the good proposal.
>>
>> On 2015/4/28 20:21, Goldwyn Rodrigues wrote:
>>> Hi Gang,
>>>
>>> On 04/27/2015 10:00 PM, Gang He wrote:
>>>> Hi Goldwyn,
>>>>
>>>> Very nice proposal.
>>>> So far, there are some comments from me.
>>>> 1) Which tasks will we do when checking/fixing a file? We need to
>>>> define the detailed requirements further. Since we just do a
>>>> light-level file check/fix according to inode number, we need to
>>>> know which items can be done by online check and which items can
>>>> be done by offline fsck.
>>>
>>> For the first phase (regular files), these are all the reasons the
>>> disk validate function would fail. Some examples are
>>> ocfs2_validate_inode_block, ocfs2_validate_extent_block etc.
>>> As we take up system inodes (phase 2), we will add more functionality.
>>>
>> Can we classify all corruption cases and their corresponding fixes? Maybe
>> we can get some hints from fsck.
>> And I don't think errors=continue fits all cases.
>> For some cases we shouldn't let it continue with errors, to prevent more
>> damage.
>>
>>>> 2) Can we keep check and fix as two options? The check option checks
>>>> whether a file is good or bad but does not modify anything; the fix
>>>> option checks and fixes a file if the file is corrupted.
>>>
>>> Yes, there are two options: CHECK only checks, whereas FIX fixes the
>>> errors. As a precautionary measure, a CHECK command should be provided
>>> before a FIX is issued. IOW, a file should be checked for errors before
>>> actually fixing it.
>>>
>> A convenient way to know which files need to be checked should also be
>> taken into consideration.
>>
>>>> 3) When users execute the command
>>>> "echo CHECK <inode> > /sys/fs/ocfs2/filecheck" to check a file, how
>>>> do we give feedback besides printing the messages to syslog?
>>>
>>> The output should be available when you cat /sys/fs/ocfs2/filecheck. It
>>> would provide the results of the last (N) files checked. I don't want to
>>> flood the kernel log with this. Thanks for bringing this up, I will put
>>> it in the doc. Something like:
>>>
>>> Inode   Status     Description
>>> 1234    ERROR      Metadata incorrect
>>> 2352    FIXED      Valid flag not set
>>> 9382    CHECKING   -
>>> 8926    GOOD       -
>>> 7230    CANT-FIX   Please execute fsck.ocfs2 after taking filesystem offline.
>>>
>>> So, for the current scenario, only 1234 can be fixed. An echo should err
>>> with EINVAL if any other inode number is provided with FIX.
>>>
>>>> 4) We should support a list to accept the "check/fix" requests from
>>>> user space and queue them, then handle them one by one, right? What is
>>>> the behavior for the user who executes "echo check ..." from user
>>>> space? The user posts a request to kernel space; will the command then
>>>> return immediately or wait for the file check to end?
>>>>
>>>
>>> I would not suggest that, at least for now. This is to improve
>>> availability. However, if the filesystem is very bad, we should suggest
>>> an offline check. However, the user can provide multiple CHECK requests.
> My question is whether users can execute "echo check > .." to check/fix
> files simultaneously, since users can trigger this command from different
> terminals.
I think we have to restrict it, since offline fsck is also not supposed to
allow such a case. If we have to, maybe user dlm can take care of this.
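To make the restriction concrete, here is a minimal single-node sketch in
Python. It is only an illustration: the lock file path and the helper name
filecheck_lock are made up for this example and are not part of the proposal.
A plain advisory flock only excludes callers on the same node; cluster-wide
exclusion would still need something like user dlm.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file; not part of the proposed filecheck interface.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "ocfs2-filecheck.lock")


@contextmanager
def filecheck_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of one request.

    Raises BlockingIOError if another request already holds the lock.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # LOCK_NB makes a second caller fail at once instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

A caller would wrap each check/fix request in "with filecheck_lock():", so a
second terminal issuing a request at the same time gets an error right away
rather than running concurrently.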
> Second, users send a command to kernel space, and kernel space has to cache
> these commands in a list/array, since the kernel cannot finish a check
> request immediately. Otherwise, how does the kernel accept a new request
> while it is handling the current one?
I think the operations should be done one by one. IMO, the kernel finds the
corruption and reports it to user space. In user space we maintain a
corruption list. Then the user checks/fixes the entries one by one.
>
> Thanks
> Gang
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
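
The one-by-one handling suggested in the reply above could look roughly like
the following user-space sketch in Python. It is an assumption-laden
illustration, not the proposed implementation: process_corruptions and the
submit callable are hypothetical names, and submit stands in for whatever
eventually sends "CHECK <inode>" or "FIX <inode>" and reads back the status.
The status strings come from the example table earlier in the thread.

```python
from collections import deque

# Status string from the example table in this thread; only inodes
# reported as ERROR are eligible for FIX.
FIXABLE = "ERROR"


def process_corruptions(inodes, submit):
    """Check each inode in turn and fix only those reported as ERROR.

    `submit(command, inode)` is a hypothetical callable that sends one
    request ("CHECK" or "FIX") and returns the resulting status string.
    """
    pending = deque(inodes)
    results = {}
    while pending:
        inode = pending.popleft()        # strictly one request at a time
        status = submit("CHECK", inode)  # CHECK always precedes FIX
        if status == FIXABLE:
            status = submit("FIX", inode)
        results[inode] = status
    return results
```

Keeping the list in user space means the kernel side only ever sees one
outstanding request, and a FIX is never sent for an inode that was not
checked first.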