Hello Stefan, thank you for your help and for finding the problem :)
Can you tell me how I can obtain your unofficial upgrade of foc, and how I can replace Genode's standard version with it?

Kind regards,
Denis

On 26.09.2016 15:15, Stefan Kalkowski wrote:
> Hi Denis,
>
> I further examined the issue. First, I found out that it is specific to Fiasco.OC. If you use another kernel, e.g. Nova, with the same test, it succeeds. So I instrumented the core component to always enter Fiasco.OC's kernel debugger when core unmapped the corresponding managed dataspace. Looking at the page tables, I could see that the mapping was successfully deleted. After that I enabled all kinds of logging related to page faults and mapping operations. Lo and behold, after continuing and seeing that the "target" thread continued, I re-entered the kernel debugger and realized that the page-table entry had reappeared, although the kernel did not list any activity regarding page faults and mappings. To me this is a clear kernel bug.
>
> I've tried out my unofficial upgrade to revision r67 of the Fiasco.OC kernel, and with that version it seemed to work correctly (I just tested a few rounds).
>
> I fear the currently supported version of Fiasco.OC is buggy with respect to the unmap call, at least in the way Genode has to use it.
>
> Regards
> Stefan
>
> On 09/26/2016 11:13 AM, Stefan Kalkowski wrote:
>> Hi Denis,
>>
>> I've looked into your code, and what struck me first was that you use two threads in your server which share data between them (Resource::Client_resources) without synchronization.
>>
>> I've rewritten your example server to use only one thread in a state-machine-like fashion; have a look here:
>>
>> https://github.com/skalk/genode-CheckpointRestore-SharedMemory/commit/d9732dcab331cecdfd4fcc5c8948d9ca23d95e84
>>
>> This way it is thread-safe, simpler (less code), and once you are used to the style, even easier to understand.
>>
>> Nevertheless, although the possible synchronization problems are eliminated by design, your described problem remains. I'll have a deeper look into our attach/detach implementation of managed dataspaces, but I cannot promise that this will happen soon.
>>
>> Best regards
>> Stefan
>>
>> On 09/26/2016 10:44 AM, Sebastian Sumpf wrote:
>>> Hey Denis,
>>>
>>> On 09/24/2016 06:20 PM, Denis Huber wrote:
>>>> Dear Genode Community,
>>>>
>>>> perhaps the wall of text is a bit discouraging, so let me summarize the important facts of the scenario:
>>>>
>>>> * Two components, 'ckpt' and 'target'
>>>> * ckpt holds a thread capability of target's main thread
>>>> * ckpt shares a managed dataspace with target
>>>> * this managed dataspace is initially empty
>>>>
>>>> target's behaviour:
>>>> * target periodically reads and writes from/to the managed dataspace
>>>> * target causes page faults (pf), which are handled by ckpt's pf-handler thread
>>>> * the pf handler attaches a pre-allocated dataspace to the managed dataspace and resolves the pf
>>>>
>>>> ckpt's behaviour:
>>>> * ckpt periodically detaches all attached dataspaces from the managed dataspace
>>>>
>>>> Outcome:
>>>> After two successful cycles (pf -> attach -> detach, pf -> attach -> detach), the target does not cause a pf, but reads and writes to the managed dataspace although it is (theoretically) empty.
>>>>
>>>> I used Genode 16.05 with a foc_pbxa9 build. Can somebody help me with my problem? I actually have no idea what the problem could be.
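For illustration, a minimal sketch of the attach-on-fault mechanism summarized above, written against roughly the Genode 16.05 region-map interface. The names, the 4 KiB sizes, and the use of the legacy Genode::env() accessor are placeholders, not code from the repositories referenced in this thread, and exact signatures may vary between API revisions:

/* sketch: resolve a single fault on a managed dataspace by attaching RAM */
#include <base/env.h>
#include <base/signal.h>
#include <rm_session/connection.h>
#include <region_map/client.h>

using namespace Genode;

enum { MANAGED_SIZE = 4096, SUB_DS_SIZE = 4096 };

static void resolve_one_fault(Rm_connection &rm)
{
    /* the "managed dataspace" is a region map with nothing attached yet */
    Region_map_client managed(rm.create(MANAGED_SIZE));
    Dataspace_capability managed_ds = managed.dataspace(); /* handed out to 'target' */

    /* register a handler that is signalled on faults within the region map */
    Signal_receiver sig_rec;
    Signal_context  sig_ctx;
    managed.fault_handler(sig_rec.manage(&sig_ctx));

    /* ... 'target' attaches managed_ds locally and touches it, raising a fault ... */

    sig_rec.wait_for_signal();

    Region_map::State state = managed.state();
    if (state.type != Region_map::State::READY) {

        /* attach a pre-allocated dataspace at the page-aligned fault address */
        Ram_dataspace_capability sub_ds = env()->ram_session()->alloc(SUB_DS_SIZE);
        managed.attach_at(sub_ds, state.addr & ~(addr_t)(SUB_DS_SIZE - 1));

        /* the faulting thread of 'target' resumes once the fault is resolved */
    }
}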
>>>
>>> You are programming against fairly untested grounds here. There might still be bugs or corner cases in this area of the code. So, someone might have to look into things (while we are very busy right now). Your problem is reproducible with [4], right?
>>>
>>> By the way, your way of reporting is exceptional: the more information and actual test code we have, the better we can debug problems. So please keep it this way, even though we might not read all of it at times ;)
>>>
>>> Regards, and if I find the time, I will look into your issue,
>>>
>>> Sebastian
>>>
>>>> On 19.09.2016 15:01, Denis Huber wrote:
>>>>> Dear Genode Community,
>>>>>
>>>>> I want to implement a mechanism to monitor the access of a component to its address space.
>>>>>
>>>>> My idea is to implement a monitoring component which provides managed dataspaces to a target component. Each managed dataspace has several designated dataspaces (allocated, but not attached, and with a fixed location in the managed dataspace). I want to use several dataspaces to control the access range of the target component.
>>>>>
>>>>> Whenever the target component accesses an address in the managed dataspace, a page fault is triggered, because the managed dataspace has no dataspaces attached to it. The page fault is caught by a custom page-fault handler. The page-fault handler attaches the designated dataspace into the faulting managed dataspace and resolves the page fault.
>>>>>
>>>>> To test my concept, I implemented a prototypical system with a monitoring component (called "ckpt") [1] and a target component [2].
>>>>>
>>>>> [1] https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/src/test/concept_session_rm/server/main.cc
>>>>> [2] https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/src/test/concept_session_rm/client/main.cc
>>>>>
>>>>> The monitoring component provides a service [3] which receives a Thread capability (used to pause the target component before detaching the dataspaces and to resume it afterwards) and which provides a managed dataspace to the client.
>>>>>
>>>>> [3] https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/tree/b502ffd962a87a5f9f790808b13554d6568f6d0b/include/resource_session
>>>>>
>>>>> The monitoring component runs a main loop which pauses the client's main thread and detaches all attached dataspaces from the managed dataspace. The target component also runs a main loop which prints (reads) a number from the managed dataspace to the console and increments (writes) it in the managed dataspace.
>>>>>
>>>>> The run script can be found here [4].
>>>>>
>>>>> [4] https://github.com/702nADOS/genode-CheckpointRestore-SharedMemory/blob/b502ffd962a87a5f9f790808b13554d6568f6d0b/run/concept_session_rm.run
>>>>>
>>>>> The scenario works for the first 3 iterations of the monitoring component: every 4 seconds it detaches the dataspaces from the managed dataspace and afterwards resolves the page faults by attaching the dataspaces back. After the 3rd iteration, the target component accesses the theoretically empty managed dataspace but does not trigger a page fault. In fact, it reads and writes to the designated dataspaces as if they were attached.
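For illustration, a minimal sketch of the target-side behaviour described above: attach the received managed dataspace and repeatedly print and increment a counter stored in it. How the dataspace capability is obtained from the Resource session is left out, and the legacy Genode::env(), PLOG, and Timer usage is an assumption based on the 16.05-era API, not code from the linked repository:

/* sketch: the target's read/increment loop over the managed dataspace */
#include <base/env.h>
#include <base/printf.h>
#include <dataspace/capability.h>
#include <timer_session/connection.h>

using namespace Genode;

static void target_main_loop(Dataspace_capability managed_ds)
{
    /* map the (initially empty) managed dataspace into the local address space */
    void *ptr = env()->rm_session()->attach(managed_ds);
    unsigned volatile *counter = static_cast<unsigned volatile *>(ptr);

    Timer::Connection timer;

    for (;;) {
        /* each access may fault; the monitor's handler is expected to resolve it */
        PLOG("%u", *counter);     /* read  */
        *counter = *counter + 1;  /* write */
        timer.msleep(1000);
    }
}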
>>>>>
>>>>> By running the run script, I get the following output:
>>>>>
>>>>> [init -> target] Initialization started
>>>>> [init -> target] Requesting session to Resource service
>>>>> [init -> ckpt] Initialization started
>>>>> [init -> ckpt] Creating page fault handler thread
>>>>> [init -> ckpt] Announcing Resource service
>>>>> [init -> target] Sending main thread cap
>>>>> [init -> target] Requesting dataspace cap
>>>>> [init -> target] Attaching dataspace cap
>>>>> [init -> target] Initialization ended
>>>>> [init -> target] Starting main loop
>>>>> Genode::Pager_entrypoint::entry()::<lambda(Genode::Pager_object*)>:Could not resolve pf=6000 ip=10034bc
>>>>> [init -> ckpt] Initialization ended
>>>>> [init -> ckpt] Starting main loop
>>>>> [init -> ckpt] Waiting for page faults
>>>>> [init -> ckpt] Handling page fault: READ_FAULT pf_addr=0x00000000
>>>>> [init -> ckpt] attached sub_ds0 at address 0x00000000
>>>>> [init -> ckpt] Waiting for page faults
>>>>> [init -> target] 0
>>>>> [init -> target] 1
>>>>> [init -> target] 2
>>>>> [init -> target] 3
>>>>> [init -> ckpt] Iteration #0
>>>>> [init -> ckpt] valid thread
>>>>> [init -> ckpt] detaching sub_ds_cap0
>>>>> [init -> ckpt] sub_ds_cap1 already detached
>>>>> Genode::Pager_entrypoint::entry()::<lambda(Genode::Pager_object*)>:Could not resolve pf=6000 ip=10034bc
>>>>> [init -> ckpt] Handling page fault: READ_FAULT pf_addr=0x00000000
>>>>> [init -> ckpt] attached sub_ds0 at address 0x00000000
>>>>> [init -> ckpt] Waiting for page faults
>>>>> [init -> target] 4
>>>>> [init -> target] 5
>>>>> [init -> target] 6
>>>>> [init -> target] 7
>>>>> [init -> ckpt] Iteration #1
>>>>> [init -> ckpt] valid thread
>>>>> [init -> ckpt] detaching sub_ds_cap0
>>>>> [init -> ckpt] sub_ds_cap1 already detached
>>>>> [init -> target] 8
>>>>> [init -> target] 9
>>>>> [init -> target] 10
>>>>> [init -> target] 11
>>>>> [init -> ckpt] Iteration #2
>>>>> [init -> ckpt] valid thread
>>>>> [init -> ckpt] sub_ds_cap0 already detached
>>>>> [init -> ckpt] sub_ds_cap1 already detached
>>>>> [init -> target] 12
>>>>> [init -> target] 13
>>>>>
>>>>> As you can see, after "Iteration #1" ended, no page fault was caused, although the target component printed and incremented the integer stored in the managed dataspace.
>>>>>
>>>>> Could it be that the detach method was not executed correctly?
>>>>>
>>>>> Kind regards
>>>>> Denis
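For reference, the detach step Denis asks about corresponds, in essence, to a single region-map call on the monitor side. A minimal sketch, with 'attached_at' standing in for the address recorded at attach time and the pausing/resuming of the target's main thread omitted:

/* sketch: revoke the sub-dataspace from the managed region map */
#include <region_map/client.h>

using namespace Genode;

static void detach_sub_dataspace(Region_map_client &managed, addr_t attached_at)
{
    /* after this call, nothing backs the region anymore */
    managed.detach(attached_at);

    /*
     * The next read or write by 'target' should therefore raise a page fault
     * that is delivered to the registered fault handler - unless, as Stefan
     * observed on Fiasco.OC, the mapping reappears in the page table.
     */
}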