Yeah, during the tests we did with Gan, I had a CAS on the in_use. Let me put that in. I would not want the tracing code disabled in the early stages, if it is fixable. We would lose a bunch of info spitted out before. Setting up GS base as first thing on remote core boots, will help.
On Tue, Dec 8, 2015 at 8:22 AM, Barret Rhoden <[email protected]> wrote: > On 2015-12-08 at 06:27 "'Davide Libenzi' via Akaros" > <[email protected]> wrote: > > After looking a bit more about this bug, this was, /methink, more > > triggered by memory structures being just at different positions due > > to new Gan code, more than fiddling with GS_BASE MSR. > > There is basically a window of time, where num_cores >0 and GS/percpu > > is not setup correctly, but still pointing somewhere we do not GPF. > > I guess this is after topology_init() and before smp_boot() > (arch_init() -> smp_boot() -> smp_final_core_init()). > > This is somewhat true. Core 0's GS should be set up in assembly before > we even get into kernel_init(). k/a/x/entry64.S. So the above window > only should exist for cores other than core 0. > > Any idea which core was flipping out? (some non-zero core could be > printing during smp_boot during an APIC access, which is my guess). > > Also, I'm not sure this fix is enough: > > static struct trace_printk_buffer *kprof_get_printk_buffer(void) > { > static struct trace_printk_buffer boot_tpb; > static struct trace_printk_buffer *cpu_tpbs; > > if (unlikely(!num_cores)) > return &boot_tpb; > if (unlikely(!cpu_tpbs)) { > /* Poor man per-CPU data structure. I really do no like > * littering global data structures with module specific data. > */ > spin_lock_irqsave(&ktrace_lock); > if (!cpu_tpbs) > cpu_tpbs = kzmalloc(num_cores * sizeof(struct > trace_printk_buffer), 0); > spin_unlock_irqsave(&ktrace_lock); > } > > return cpu_tpbs + core_id_early(); > } > > (this is with your change). > > If we get a printk from a core != 0, but before smp_final_core_init() > finishes (specifically, when we set core_id_ready = TRUE;), then > core_id_early will return 0, and some core other than core 0 will get > the trace buffer for core 0. > > Odds are, there's a race associated with that. For instance: > > char *usrbuf = tpb->buffer, *kpbuf = tpb->buffer + usr_bufsz; > const char *utop, *uptr; > char hdr[64]; > > if (tpb->in_use) > return; > tpb->in_use++; > > That is racy if there are concurrent users of tpb. > > One solution is to not use trace_printk for all printks before > smp_boot. That's lousy since we'd like to have all of that saved. > > Another would be to not use per-core buffers for trace_printk. It's not > clear that we need that level of optimization for regular printk (which > is already serialized, btw), but maybe we would like it for the > tracer. I'd also like to keep the "cpuX:" formatting. So that stinks > too. > > Another option would be to have a call to disable trace_vprintk > temporarily, and smp_boot can turn it off and on. > > In general, it's hard to do per-cpu things when we're in the window of > having multiple cores but not having completed smp initialization. > > FWIW, we'll probably run into this problem in the future with other > things. > > Barret > > p.s. https://www.youtube.com/watch?v=6bOy3RNyWME > > -- > You received this message because you are subscribed to the Google Groups > "Akaros" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "Akaros" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. For more options, visit https://groups.google.com/d/optout.
