Hi Dave

I would like to discuss the use of FAULT_ON_ERROR in readmem calls. I have 
now seen a number of situations where it prevents Crash from producing 
appropriate results when some memory is corrupt.

The most recent problem, which I saw a few days ago, was in kernel.c, in 
function dumplog:
  readmem(log_buf, KVADDR, buf,
          log_buf_len, "log_buf contents", FAULT_ON_ERROR);
The problem was that log_buf_len contained a very large value (memory 
overwrite?), so the readmem failed because of the size. This of course meant 
that it was not possible to print the log, but as this function is called 
during Crash startup, it also caused Crash to terminate during startup. By 
just changing FAULT_ON_ERROR to RETURN_ON_ERROR and returning if the 
readmem failed, I could use Crash to investigate this vmcore file, except for 
printing the log.
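
As a rough illustration of that change, here is a small self-contained sketch. It is not the Crash source: sketch_readmem, sketch_dump_log, fake_mem and the constant values are stand-ins invented so that the example compiles on its own, and the readmem signature is simplified. It only shows the pattern of passing RETURN_ON_ERROR and returning early when the read fails.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in definitions so the sketch compiles outside the Crash tree.
 * The values and the simplified signature below are assumptions. */
#define TRUE            1
#define FALSE           0
#define KVADDR          1
#define FAULT_ON_ERROR  1
#define RETURN_ON_ERROR 2

/* Hypothetical "dump memory": a tiny buffer holding one log line. */
#define FAKE_MEM_SIZE 64
static char fake_mem[FAKE_MEM_SIZE] = "<6>Booting Linux\n";

/* Hypothetical replacement for readmem(): copies from fake_mem and fails
 * when the request does not fit inside the fake memory image. */
int sketch_readmem(unsigned long addr, int memtype, void *buf, long size,
                   const char *what, unsigned long error_handle)
{
    (void)memtype;
    if (size < 0 || addr > FAKE_MEM_SIZE ||
        (unsigned long)size > FAKE_MEM_SIZE - addr) {
        fprintf(stderr, "read error: %s\n", what);
        if (error_handle & FAULT_ON_ERROR)
            exit(1);        /* in this sketch the fault path ends the program */
        return FALSE;       /* RETURN_ON_ERROR: let the caller decide */
    }
    memcpy(buf, fake_mem + addr, (size_t)size);
    return TRUE;
}

/* Hypothetical log dump: returns TRUE if the log was printed,
 * FALSE if it had to be skipped. */
int sketch_dump_log(unsigned long log_buf, long log_buf_len)
{
    char *buf;

    if (log_buf_len <= 0)
        return FALSE;
    buf = malloc((size_t)log_buf_len);
    if (buf == NULL)
        return FALSE;

    if (!sketch_readmem(log_buf, KVADDR, buf, log_buf_len,
                        "log_buf contents", RETURN_ON_ERROR)) {
        free(buf);
        return FALSE;       /* skip the log, keep the session alive */
    }

    fwrite(buf, 1, (size_t)log_buf_len, stdout);
    free(buf);
    return TRUE;
}
```

With a sane length, for example sketch_dump_log(0, 17), the log line is printed. With an implausibly large log_buf_len the function just reports the read error and returns FALSE, so the caller can carry on with startup.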

A second place where I have patched Crash is the function arm_uvtop 
(arm.c). In the readmem calls in this function I have changed FAULT_ON_ERROR to 
RETURN_ON_ERROR and added a "return FALSE;" if the readmem fails. 
Unfortunately I do not remember the details of why I made this change, but I 
think there was a case where Crash terminated during startup, and with these 
changes it was possible to investigate the vmcore file.

Another situation I have seen is in helper functions like fill_vma_cache and 
fill_file_cache. When I use these functions in extensions, the command 
fails and terminates immediately if a readmem call fails. In several cases I 
could easily handle such a failure, and the command could still produce a lot 
of relevant results.

In the plugins I write, I use RETURN_ON_ERROR practically everywhere and 
then of course handle the error situations myself. I have done this to avoid 
situations like the ones described above.
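
To make the idea concrete, here is a minimal sketch of how a plugin command can skip an unreadable record and still report the rest. Everything in it is hypothetical (sketch_vma, sketch_read_record, sketch_list_records, the fake data and the constant values). It does not use the real Crash extension API and only models the RETURN_ON_ERROR behaviour.

```c
#include <stdio.h>

/* Stand-in definitions; the values are assumptions for this sketch. */
#define TRUE            1
#define FALSE           0
#define RETURN_ON_ERROR 2

/* Hypothetical record type, loosely modelled on a VMA address range. */
struct sketch_vma {
    unsigned long start;
    unsigned long end;
};

/* Hypothetical dump contents: three records, the middle one unreadable
 * (as if that part of memory were corrupt or missing from the vmcore). */
#define NR_RECORDS 3
static const struct sketch_vma fake_records[NR_RECORDS] = {
    { 0x8000UL,  0x9000UL  },
    { 0UL,       0UL       },
    { 0x40000UL, 0x48000UL },
};
static const int fake_readable[NR_RECORDS] = { 1, 0, 1 };

/* Hypothetical reader: returns FALSE instead of aborting the command. */
int sketch_read_record(int index, struct sketch_vma *out,
                       unsigned long error_handle)
{
    (void)error_handle;     /* only RETURN_ON_ERROR is modelled here */
    if (index < 0 || index >= NR_RECORDS || !fake_readable[index])
        return FALSE;
    *out = fake_records[index];
    return TRUE;
}

/* Prints every readable record and returns the number of skipped ones. */
int sketch_list_records(void)
{
    struct sketch_vma vma;
    int i, skipped = 0;

    for (i = 0; i < NR_RECORDS; i++) {
        if (!sketch_read_record(i, &vma, RETURN_ON_ERROR)) {
            printf("record %d: <unreadable, skipped>\n", i);
            skipped++;
            continue;       /* handle the error locally and go on */
        }
        printf("record %d: %lx-%lx\n", i, vma.start, vma.end);
    }
    return skipped;
}
```

Calling sketch_list_records() prints the two readable ranges, marks the middle record as skipped, and returns the skip count, so the command can mention at the end that its output is incomplete.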

I am not asking you to remove most uses of FAULT_ON_ERROR, as I realize the 
size and risk of such a change. However, I would like to raise the 
question and hear your views. When working with vmcore files that have minor 
memory corruption, FAULT_ON_ERROR limits the usability of Crash.

Jan


Jan Karlsson
Senior Software Engineer
MIB

Sony Mobile Communications
Tel: +46703062174
sonymobile.com<http://sonymobile.com/>


--
Crash-utility mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/crash-utility
