Hi,
kirito has left the following comment at Identify and Fix ANY bug that
causes a BRL-CAD tool to crash #4
https://www.google-melange.com/gci/task/view/google/gci2014/4533992846000128:
detail of crash
In computing, a segmentation fault (often shortened to segfault) or access
violation is a fault raised by hardware with memory protection, notifying
an operating system (OS) about a memory access violation; on x86 computers
this is a form of general protection fault. The OS kernel will in response
usually perform some corrective action, generally passing the fault on to
the offending process by sending the process a signal. Processes can in
some cases install a custom signal handler, allowing them to recover on
their own, but otherwise the OS default signal handler is used, generally
causing abnormal termination of the process (a program crash), and
sometimes a core dump.
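To make the custom signal handler idea concrete, here is a minimal sketch
for POSIX systems. The names run_guarded, on_segv, simulated_fault and
harmless are made up for illustration, and the fault is simulated with
raise(SIGSEGV) rather than a real bad memory access. Jumping out of a
handler after a genuine fault is rarely safe, because the program state may
already be corrupted, so treat this as a demonstration of the mechanism
only.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Where to resume after the handler fires (hypothetical name). */
static sigjmp_buf recovery_point;

/* Custom SIGSEGV handler: jump back to the saved recovery point. */
static void on_segv(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/* Stand-in for a faulting operation: delivers SIGSEGV to this process. */
void simulated_fault(void)
{
    raise(SIGSEGV);
}

/* A callback that does nothing and so never faults. */
void harmless(void)
{
}

/* Runs risky() with the handler installed.
 * Returns 1 if SIGSEGV was caught, 0 if risky() returned normally,
 * -1 if the handler could not be installed. */
int run_guarded(void (*risky)(void))
{
    struct sigaction sa, old;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    /* Save the signal mask too, so SIGSEGV is unblocked after the jump. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        risky();
        result = 0;
    } else {
        result = 1;
    }

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous disposition */
    return result;
}
```

Calling run_guarded(simulated_fault) should come back with 1 instead of
terminating the process, while run_guarded(harmless) returns 0.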
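reproduce segfault
To capture core dumps, first lift the core file size limit in the shell:
$ ulimit -c unlimited
Then run the program. If it crashes, it should generate a core file (where
the file is written depends on the system configuration).
Then load the core file in gdb:
$ gdb ./program core
gdb will load the core file but does not print a backtrace by itself; type
backtrace (or bt) at the (gdb) prompt to see exactly what operation
elicited the segfault.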
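The ulimit command only affects the current shell and the programs started
from it. As a rough sketch, a program can also raise its own limit at
startup with the POSIX getrlimit/setrlimit calls. The function name
enable_core_dumps is made up for this example, and the soft limit can only
go as high as the hard limit, so this matches "unlimited" only when the
hard limit itself is unlimited.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    lim.rlim_cur = lim.rlim_max;   /* soft limit may not exceed hard limit */
    return setrlimit(RLIMIT_CORE, &lim) == 0 ? 0 : -1;
}
```

Calling enable_core_dumps() early in main() would be the usual place for
it. Whether a core file actually appears still depends on system settings
outside the program's control.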
The default action for a segmentation fault or bus error is abnormal
termination of the process that triggered it. A core file may be generated
to aid debugging, and other platform-dependent actions may also be
performed. For example, Linux systems using the grsecurity patch may log
SIGSEGV signals in order to monitor for possible intrusion attempts using
buffer overflows.
Writing off the end of the array
Generally, if you're writing off the bounds of an array, then the line that
caused the segfault in the first place should be an array access. (There
are a few times when this won't actually be the case -- notably, if the
fact that you wrote off an array causes the stack to be smashed --
basically, overwriting the pointer that stores where to return after the
function completes.)
Of course, sometimes, you won't actually cause a segfault writing off the
end of the array. Instead, you might just notice that some of your variable
values are changing periodically and unexpectedly. This is a tough bug to
crack; one option is to set up your debugger to watch a variable for
changes and run your program until the variable's value changes. Your
debugger will break on that instruction, and you can poke around to figure
out if that behavior is unexpected.
(gdb) watch [variable name]
Hardware watchpoint 1: [variable name]
(gdb) continue
...
Hardware watchpoint 1: [variable name]
Old value = [value1]
New value = [value2]
This approach can get tricky when you're dealing with a lot of dynamically
allocated memory and it's not entirely clear what you should watch. To
simplify things, use simple test cases, keep working with the same inputs,
and turn off randomized seeds if you're using random numbers!
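When it is not clear which variable to watch, another option (not from the
text above, just a common trick) is to put known guard values on both sides
of a buffer and check them later. The sketch below keeps the guards and the
data inside one larger array so that the off-by-one write stays inside
memory the program owns. GUARD, guarded_init, guards_intact and buggy_fill
are hypothetical names.

```c
#include <stddef.h>

/* Arbitrary marker value that real data is unlikely to contain. */
enum { GUARD = 0x5AFEC0DE };

/* storage must hold data_len + 2 ints: guard, data..., guard.
 * Returns a pointer to the usable data area. */
int *guarded_init(int *storage, size_t data_len)
{
    size_t i;

    storage[0] = GUARD;
    for (i = 1; i <= data_len; i++)
        storage[i] = 0;
    storage[data_len + 1] = GUARD;
    return storage + 1;
}

/* Returns 1 if both guards still hold their marker value, else 0. */
int guards_intact(const int *storage, size_t data_len)
{
    return storage[0] == GUARD && storage[data_len + 1] == GUARD;
}

/* Deliberately buggy: "<=" writes one element past the data area. */
void buggy_fill(int *data, size_t data_len, int value)
{
    size_t i;

    for (i = 0; i <= data_len; i++)
        data[i] = value;
}
```

After buggy_fill runs, guards_intact reports that the trailing guard was
overwritten, which narrows the search to code that ran between the last
successful check and the failing one.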
Stack Overflows
A stack overflow isn't the same type of pointer-related problem as the
others. In this case, you don't need to have a single explicit pointer in
your program; you just need a recursive function without a base case.
Nevertheless, this is a tutorial about segmentation faults, and on some
systems, a stack overflow will be reported as a segmentation fault. (This
makes sense because running out of memory on the stack will violate memory
segmentation.)
To diagnose a stack overflow in GDB, typically you just need to do a
backtrace:
(gdb) backtrace
#0 foo() () at t.cpp:5
#1 0x08048404 in foo() () at t.cpp:5
#2 0x08048404 in foo() () at t.cpp:5
#3 0x08048404 in foo() () at t.cpp:5
[...]
#20 0x08048404 in foo() () at t.cpp:5
#21 0x08048404 in foo() () at t.cpp:5
#22 0x08048404 in foo() () at t.cpp:5
---Type <return> to continue, or q <return> to quit---
If you find a single function call piling up an awfully large number of
times, this is a good indication of a stack overflow.
Typically, you need to analyze your recursive function to make sure that
all the base cases (the cases in which the function should not call itself)
are covered correctly. For instance, in computing the factorial function:
int factorial(int n)
{
    // What about n < 0?
    if(n == 0)
    {
        return 1;
    }
    return factorial(n-1) * n;
}
In this case, the base case of n being zero is covered, but what about n <
0? On "valid" inputs, the function will work fine, but not on "invalid"
inputs like -1.
You also have to make sure that your base case is reachable. Even if you
have the correct base case, if you don't correctly progress toward the base
case, your function will never terminate.
int factorial(int n)
{
    if(n <= 0)
    {
        return 1;
    }
    // Oops, we forgot to subtract 1 from n
    return factorial(n) * n;
}
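For comparison, here is one possible version that addresses both problems:
every input reaches a base case, and each call moves closer to it. The name
factorial_checked and the choice to return 0 for rejected inputs are
assumptions made for this sketch (a factorial is never 0, so 0 can serve as
an error marker). The upper bound of 20 is there because 21! no longer fits
in 64 bits.

```c
/* Returns n! for 0 <= n <= 20, or 0 if n is outside that range. */
unsigned long long factorial_checked(int n)
{
    /* Reject inputs that would never reach the base case (n < 0)
     * or whose result would overflow 64 bits (n > 20). */
    if (n < 0 || n > 20)
        return 0;

    /* Base case. */
    if (n == 0)
        return 1;

    /* Recursive case: n - 1 moves one step closer to the base case. */
    return factorial_checked(n - 1) * (unsigned long long)n;
}
```

With this version, factorial_checked(5) gives 120 and factorial_checked(-1)
gives 0 right away instead of recursing until the stack runs out.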
Greetings,
The Google Open Source Programs Team
---
You are receiving this message because you are subscribed to Identify and
Fix ANY bug that causes a BRL-CAD tool to crash #4.
To stop receiving these messages, go to:
https://www.google-melange.com/gci/task/view/google/gci2014/4533992846000128.
_______________________________________________
BRL-CAD Tracker mailing list
brlcad-tracker@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/brlcad-tracker