Hi,

I'm one of the developers of Mono for Android, and I just discovered
that HTC has patched their customized kernel to kill a process once it
has triggered more than 10 page faults that raise SIGSEGV.

I have an HTC Desire HD.

Kernel version:
2.6.32.21-g1e30168
htc-kernel@and18-2 #1 Fri Dec 10 18:43:12 CST 2010

Build number:
1.75.161.2 CL301245 release-keys

Software number:
1.75.161.2

I actually checked the kernel sources of this particular kernel and
found the following in arch/arm/mm/fault.c:

====
void
__do_user_fault(struct task_struct *tsk, unsigned long addr,
                unsigned int fsr, unsigned int sig, int code,
                struct pt_regs *regs)
{
        struct siginfo si;
        struct task_struct *g, *p, *selected = NULL;

#ifdef CONFIG_DEBUG_USER
        if (user_debug & UDBG_SEGV) {
                printk(KERN_DEBUG "%s: unhandled page fault (%d) at 0x%08lx, code 0x%03x\n",
                       tsk->comm, sig, addr, fsr);
                show_pte(tsk->mm, addr);
                show_regs(regs);
        }
#endif
        if (sig == SIGSEGV)
                tsk->segfault_count++;

        if (tsk->segfault_count > 10) {
                tsk->segfault_count = 0;
                printk(KERN_ERR "unhandled page fault at 0x%08lx, code 0x%03x\n",
                        addr, fsr);
                show_pte(tsk->mm, addr);
                show_regs(regs);

                do_each_thread(g, p) {
                        task_lock(p);
                        if (p == tsk)
                                selected = g;
                        task_unlock(p);
                } while_each_thread(g, p);

                if (selected) {
                        printk(KERN_ERR "%s: triggered too many segfaults, force killing parent: %s\n",
                                tsk->comm, selected->comm);
                        force_sig(SIGKILL, selected);
                        return;
                }
        }

        tsk->thread.address = addr;
        tsk->thread.error_code = fsr;
        tsk->thread.trap_no = 14;
        si.si_signo = sig;
        si.si_errno = 0;
        si.si_code = code;
        si.si_addr = (void __user *)addr;
        force_sig_info(sig, &si, tsk);
}
====
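
To make the counting explicit, here is a small user-space model of just
the counter logic quoted above. It is only a sketch: struct model_task
and both function names are made up for illustration, and it assumes
the thread lookup in the kernel code always finds the task, so the
SIGKILL path is taken whenever the counter exceeds 10.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the single task_struct field involved. */
struct model_task {
    int segfault_count;
};

/*
 * Mirrors the counter handling in the quoted __do_user_fault():
 * returns true when the process would be force-killed, false when
 * the signal would be delivered to the task as usual.
 */
bool model_user_fault(struct model_task *tsk, int sig)
{
    assert(tsk != NULL);

    if (sig == SIGSEGV)
        tsk->segfault_count++;

    if (tsk->segfault_count > 10) {
        tsk->segfault_count = 0;
        return true;    /* corresponds to force_sig(SIGKILL, selected) */
    }
    return false;       /* corresponds to force_sig_info(sig, &si, tsk) */
}

/* Counts how many SIGSEGV faults are delivered normally before the kill. */
int model_faults_until_kill(void)
{
    struct model_task tsk = { 0 };
    int delivered = 0;

    while (!model_user_fault(&tsk, SIGSEGV))
        delivered++;
    return delivered;
}
```

In this model the first ten SIGSEGV faults are delivered and the
eleventh triggers the kill. Nothing in the quoted function resets the
counter when a signal handler deals with the fault successfully.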

Is there any reason why they put a restriction like this into their
kernel?  I'm very surprised to see something like this, and it's also
causing problems for our product.

I ran into this because Mono's soft debugger uses page faults to
generate single-step and breakpoint events and all my test apps
silently died when running in the debugger.
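
For readers unfamiliar with the technique, the following is a minimal
sketch of how a runtime can turn a page fault into a debugger event.
It is not Mono's actual implementation: the trigger page, the handler
and count_handled_faults() are hypothetical names, and the code assumes
Linux, where reading a PROT_NONE page raises SIGSEGV. Calling mprotect()
from a signal handler is common practice on Linux but is not on the
POSIX list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *trigger_page;        /* hypothetical "breakpoint" page */
static size_t trigger_size;
static volatile sig_atomic_t events_seen;  /* events reported so far */

/* SIGSEGV handler: treats a fault on the trigger page as a debugger event. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    char *page = (char *)trigger_page;

    (void)sig;
    (void)ctx;
    if (addr < page || addr >= page + trigger_size)
        _exit(1);                          /* a genuine crash, not our event */

    events_seen++;
    /* Make the page readable so the faulting load is retried and succeeds. */
    if (mprotect(page, trigger_size, PROT_READ) != 0)
        _exit(2);
}

/* Arms the trigger page n times; returns the number of handled faults. */
int count_handled_faults(int n)
{
    struct sigaction sa, old;
    void *mem;
    int result;

    trigger_size = (size_t)sysconf(_SC_PAGESIZE);
    mem = mmap(NULL, trigger_size, PROT_READ,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    trigger_page = mem;
    events_seen = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(mem, trigger_size);
        return -1;
    }

    for (int i = 0; i < n; i++) {
        char c;

        /* "Debugger" arms the event by revoking access to the page. */
        if (mprotect(mem, trigger_size, PROT_NONE) != 0)
            break;
        c = trigger_page[0];               /* faults; the handler fixes it up */
        (void)c;
    }

    result = events_seen;
    sigaction(SIGSEGV, &old, NULL);        /* restore the previous handler */
    munmap(mem, trigger_size);
    return result;
}
```

On a stock Linux kernel count_handled_faults(12) should return 12. With
the counter shown in the kernel source above, the process would be
killed on the eleventh fault even though every one of them is handled.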

I have a patch that works around this by checking a variable instead
of using page faults to generate single-step and breakpoint events.
Mono's JIT engine already has an option to check for null pointers
explicitly, so the next update of Mono for Android should also work on
this hardware.
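
The general shape of such a variable check could look like the sketch
below. This is an assumption about the approach, not the actual patch:
the flag, sequence_point() and the other names are hypothetical. The
idea is that generated code polls a flag at every point where the
debugger may want to stop, instead of reading from a page that the
debugger has protected.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag set by the debugger when it wants a step event. */
static volatile sig_atomic_t single_step_requested;
static int step_events;                    /* events reported so far */

/* Called on behalf of the debugger to request a single-step event. */
void debugger_request_single_step(void)
{
    single_step_requested = 1;
}

/* Check that JIT-compiled code would perform at each sequence point. */
static void sequence_point(void)
{
    if (single_step_requested) {
        single_step_requested = 0;
        step_events++;                     /* stand-in for notifying the debugger */
    }
}

/*
 * Runs `iterations` sequence points and requests a step before every
 * `step_every`-th one (0 disables requests). Returns the event count.
 */
int run_with_polling(int iterations, int step_every)
{
    assert(iterations >= 0 && step_every >= 0);

    step_events = 0;
    for (int i = 0; i < iterations; i++) {
        if (step_every > 0 && i % step_every == 0)
            debugger_request_single_step();
        sequence_point();
    }
    return step_events;
}
```

The trade-off is an extra load and branch at every sequence point, but
no SIGSEGV is ever raised, so the kernel's counter is never touched.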

However, I'm still worried that a restriction like this may cause
unforeseeable problems in the future.

Does anyone know why they put this patch into their kernel?  I just
can't think of any good reason to arbitrarily limit the number of page
faults that a process can have, especially if you install a SIGSEGV
signal handler which actually handles them.

Martin

-- 
You received this message because you are subscribed to the Google
Groups "Android Developers" group.
To post to this group, send email to android-developers@googlegroups.com
To unsubscribe from this group, send email to
android-developers+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/android-developers?hl=en
