Re: [Xenomai-core] [PATCH] shared irqs v.3
Hi Jan,

> As lighter may mean that reducing the structure size also reduces the
> number of used cache lines, it might be a good idea. The additional
> complexity for entry removal is negligible.

My current working version is already lighter when it comes to the size of
the additional data structures. It's implemented via a one-way linked list
instead of xnqueue_t. This way, it's 3 times lighter for UP and 2 times for
SMP systems. I'll try to post it later today.

The only remaining problem is a set of compilation issues that I should fix
before posting. Namely, it looks like some code in ksrc/nucleus (e.g.
intr.c) is compiled both for kernel mode (of course) and for user mode
(maybe for UVM, though I haven't looked at it thoroughly yet) - a link to
ksrc/nucleus is created in the src/ directory. Both the IPIPE_NR_IRQS macro
and the rthal_critical_enter/exit() calls are undefined when intr.c is
compiled for the user-mode side. That's why it so far contains those
__IPIPE_NR_IRQS and external rthal_critical_enter/exit() definitions. I
hope that also answers your other question later in this mail.

>> Believe it or not, I have considered different ways to guarantee that a
>> passed cookie param is valid (xnintr_detach() has not deleted it) and
>> remains so while xnintr_irq_handler() is running. And there are some
>> obstacles there... I'll post them later if someone is interested, since
>> I'm short of time now :)
>
> I'm interested...

Ok. So I will have at least one reader :) Actually, I still hope to find a
solution that makes use of the recently extended ipipe interface as it was
supposed to be used (then there is no need for any per-irq xnshirqs array
in intr.c). Otherwise, I have to admit that my recent work on that ipipe
extension (I can say that since I made it) is of no big avail. Maybe
together we will find a solution.

>> That code is compiled for user mode as well, and the originals are not
>> available.
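[Not part of the original mail: a minimal user-space sketch of the one-way
list idea. The names are modeled after the nucleus types, but the struct
layouts are illustrative, not the real ones. It shows the trade-off under
discussion: one `next` pointer per entry instead of a two-way holder, at
the price of a head-to-entry walk on removal.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model (not the real nucleus layout): a
 * single "next" pointer per entry replaces the two-way xnholder_t
 * that xnqueue_t would embed in each object. */
struct xnintr {
    struct xnintr *next;        /* one-way chain of handlers */
};

struct xnintr_shirq {
    struct xnintr *handlers;    /* head of the chain for one IRQ */
};

static void shirq_attach(struct xnintr_shirq *shirq, struct xnintr *intr)
{
    intr->next = shirq->handlers;
    shirq->handlers = intr;
}

/* The price of dropping the "prev" pointer: removal walks from the
 * head to find the unlink point -- negligible for short chains. */
static int shirq_detach(struct xnintr_shirq *shirq, struct xnintr *intr)
{
    struct xnintr **p;

    for (p = &shirq->handlers; *p; p = &(*p)->next) {
        if (*p == intr) {
            *p = intr->next;
            intr->next = NULL;
            return 0;
        }
    }
    return -1;  /* not attached */
}
```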
>> So consider it a temp solution for test purposes; I guess it's easily
>> fixable.
>>
>> test/shirq.c is a test module. SHIRQ_VECTOR must be the one used by
>> Linux; e.g. I have used 12, which is used by the trackball device.
>
> I haven't tried your code yet, but in preparation of a real scenario I
> stumbled over a problem in my serial driver regarding IRQ sharing: in
> case you want to use xeno_16550A for ISA devices with shared IRQs, an
> iteration over the list of registered handlers would be required /until/
> no device reported that it handled something. This is required so that
> the IRQ line gets released for a while and the system gets a chance to
> detect a new /edge/-triggered IRQ - an ISA oddity. That's the way most
> serial drivers work, but they do it internally. So the question arose
> for me whether this edge-specific handling shouldn't be moved to the
> nucleus as well (so that I don't have to fix my 16550A ;)).

Brrr... frankly speaking, I haven't got that clearly yet, so I don't want
to make pure speculations. I probably have to take a look at the
xeno_16550A driver with your words in mind.

> Another optimisation idea, which I once also realised in my own shared
> IRQ wrapper, is to use specialised trampolines at the nucleus level,
> i.e. to not apply the full sharing logic with its locking and list
> iterations for non-shared IRQs. What do you think? Worth it? Might be
> when the ISA/edge handling adds further otherwise unneeded overhead.

Yep, maybe. But let's take something working first..

> Jan

--
Best regards,
Dmitry Adamushko

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
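[Not part of the original mail: a user-space sketch of the ISA edge-trigger
loop Jan describes. The return convention and names are illustrative, not
the actual xeno_16550A code. The point is to keep re-running the whole
handler list until one complete pass where nobody claimed work, so the line
is known to have gone quiet and a new edge can be latched.]

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative convention: nonzero = "my device had work pending". */
typedef int (*shirq_isr_t)(void *cookie);

/* Re-iterate over all registered handlers until a full pass reports
 * no work: only then is the edge-triggered ISA line certain to be
 * deasserted, so the next edge will actually be seen. */
static void shirq_edge_dispatch(shirq_isr_t isrs[], void *cookies[], int n)
{
    int handled;

    do {
        handled = 0;
        for (int i = 0; i < n; i++)
            handled |= isrs[i](cookies[i]);
    } while (handled);
}
```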
Re: [Xenomai-core] [PATCH] shared irqs v.3
Hi Dmitry,

Dmitry Adamushko wrote:
> Hi, here goes another implementation of shared irqs on the nucleus
> layer. I have conducted a few tests and it seems to work. The test
> example is attached.
>
> There were 2 main issues concerning synchronization:
>
> 1) xnintr_attach() vs. xnintr_detach() (and each of them vs. itself)
>
> The problem is that we can't use the nklock (nor any other lock + irq
> off), as Gilles pointed out. A possible solution:
>
> o something like xnlock_get/put()
>
> There is no irqsave/restore-less interface of xnlock_get/put()
> available. For a pure locking scheme (without touching the irqs), the
> concept of _preemption_ (preventing a thread from being preempted while
> in a locked section) must be introduced and, at first glance, that would
> be quite difficult since it must be consistent across all domains (for
> only the primary, that would be easy).
>
> o rthal_critical_enter/exit()
>
> This one is used currently.
>
> 2) xnintr_attach/detach() vs. xnintr_irq_handler()
>
> The problem here is how to be sure that 1) the xnintr_shirq_t object is
> valid (when dynamically allocated) and 2) it's safe to iterate through
> the handlers list. Currently, 1) is ensured by the static xnintr_shirq_t
> xnshirqs[IPIPE_NR_IRQS]. Ok, it can be done lighter when a one-way list
> is used instead of xnqueue_t.

As lighter may mean that reducing the structure size also reduces the
number of used cache lines, it might be a good idea. The additional
complexity for entry removal is negligible.

> Believe it or not, I have considered different ways to guarantee that a
> passed cookie param is valid (xnintr_detach() has not deleted it) and
> remains so while xnintr_irq_handler() is running. And there are some
> obstacles there... I'll post them later if someone is interested, since
> I'm short of time now :)

I'm interested...

> There are a few ugly things in the code, namely __IPIPE_NR_IRQS and the
> definitions of rthal_critical_enter/exit().

I do understand the first issue but not what you mean with the second one.
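[Not part of the original mail: a plain C model of the scheme point 2)
describes - a static per-IRQ slot so the handler never starts from a freed
object, with attach serialized by a global critical section. A pthread
mutex stands in for rthal_critical_enter/exit() here, and the
IPIPE_NR_IRQS value is arbitrary; both are assumptions of this sketch.]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define IPIPE_NR_IRQS 16  /* arbitrary value for this model */

struct xnintr {
    void (*isr)(void *cookie);
    void *cookie;
    struct xnintr *next;
};

/* Point 1): a static per-IRQ anchor -- the xnintr_shirq_t itself is
 * never freed, so the IRQ handler always starts from a valid object. */
static struct xnintr_shirq {
    struct xnintr *handlers;
} xnshirqs[IPIPE_NR_IRQS];

/* Stand-in for rthal_critical_enter/exit(): serializes attach against
 * detach (the real primitive additionally spins all other CPUs out of
 * the interrupt path, which a mutex does not model). */
static pthread_mutex_t critical = PTHREAD_MUTEX_INITIALIZER;

static int xnintr_attach_model(int irq, struct xnintr *intr)
{
    if (irq < 0 || irq >= IPIPE_NR_IRQS)
        return -1;
    pthread_mutex_lock(&critical);
    intr->next = xnshirqs[irq].handlers;
    xnshirqs[irq].handlers = intr;
    pthread_mutex_unlock(&critical);
    return 0;
}

/* Point 2): the handler iterates the chain anchored in the static
 * table; validity of the anchor is free, validity of the entries is
 * what attach/detach must guarantee. */
static void xnintr_irq_handler_model(int irq)
{
    struct xnintr *intr;

    for (intr = xnshirqs[irq].handlers; intr; intr = intr->next)
        intr->isr(intr->cookie);
}
```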
> That code is compiled for user mode as well, and the originals are not
> available. So consider it a temp solution for test purposes; I guess
> it's easily fixable.
>
> test/shirq.c is a test module. SHIRQ_VECTOR must be the one used by
> Linux; e.g. I have used 12, which is used by the trackball device.

I haven't tried your code yet, but in preparation of a real scenario I
stumbled over a problem in my serial driver regarding IRQ sharing: in case
you want to use xeno_16550A for ISA devices with shared IRQs, an iteration
over the list of registered handlers would be required /until/ no device
reported that it handled something. This is required so that the IRQ line
gets released for a while and the system gets a chance to detect a new
/edge/-triggered IRQ - an ISA oddity. That's the way most serial drivers
work, but they do it internally. So the question arose for me whether this
edge-specific handling shouldn't be moved to the nucleus as well (so that
I don't have to fix my 16550A ;)).

Another optimisation idea, which I once also realised in my own shared IRQ
wrapper, is to use specialised trampolines at the nucleus level, i.e. to
not apply the full sharing logic with its locking and list iterations for
non-shared IRQs. What do you think? Worth it?

Jan
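[Not part of the original mail: one way the trampoline idea could look at
the dispatch level, as a user-space sketch. All names are hypothetical;
the shared path is reduced to a counter. The design point is that the
per-IRQ entry point itself is switched at attach time, so the hot
interrupt path never tests "is this IRQ shared?".]

```c
#include <assert.h>
#include <stddef.h>

typedef void (*isr_t)(void);

/* Per-IRQ dispatch slot: the installed trampoline is the entry point. */
struct irq_slot {
    void (*dispatch)(struct irq_slot *slot);
    isr_t single;   /* lone handler, used by the fast path only */
    /* ...the shared case would keep a handler list and a lock here */
};

/* Fast path for non-shared IRQs: one direct call, no locking,
 * no list iteration. */
static void dispatch_single(struct irq_slot *slot)
{
    slot->single();
}

/* Slow path placeholder: the real one would take the shared-IRQ lock
 * and walk the handler list; modeled by a counter here. */
static int shared_runs;
static void dispatch_shared(struct irq_slot *slot)
{
    (void)slot;
    shared_runs++;
}

/* Attaching picks the trampoline based on how many handlers exist. */
static void install(struct irq_slot *slot, isr_t isr, int nhandlers)
{
    slot->single = isr;
    slot->dispatch = nhandlers == 1 ? dispatch_single : dispatch_shared;
}
```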