On Tuesday 29 August 2017 03:03 PM, Jan Kiszka wrote:
> On 2017-08-29 11:06, Lokesh Vutla wrote:
>> Right now the gicv3 driver assumes there is only one cluster and
>> handles SGIs based on this assumption. This will fail for systems
>> with more than one cluster, so add support for affinity-based routing.
>>
>> Signed-off-by: Lokesh Vutla <[email protected]>
>> ---
>>  hypervisor/arch/arm-common/gic-v3.c | 66 ++++++++++++++++++++++++++++++-------
>>  1 file changed, 54 insertions(+), 12 deletions(-)
>>
>> diff --git a/hypervisor/arch/arm-common/gic-v3.c b/hypervisor/arch/arm-common/gic-v3.c
>> index 7e03b64..cb8166c 100644
>> --- a/hypervisor/arch/arm-common/gic-v3.c
>> +++ b/hypervisor/arch/arm-common/gic-v3.c
>> @@ -274,18 +274,36 @@ static int gic_cell_init(struct cell *cell)
>>
>>  static int gic_send_sgi(struct sgi *sgi)
>>  {
>> -	u64 val;
>> -	u16 targets = sgi->targets;
>> +	u64 val, cpu;
>> +	unsigned long targets = 0, tar;
>> +	u64 mpidr, aff1 = 0, aff2 = 0, aff3 = 0;
>> +	struct cell *cell;
>>
>>  	if (!is_sgi(sgi->id))
>>  		return -EINVAL;
>>
>> -	if (sgi->routing_mode == 2)
>> -		targets = 1 << phys_processor_id();
>> +	tar = sgi->targets;
>> +
>> +	for_each_cell(cell) {
>> +		for_each_cpu(cpu, cell->cpu_set) {
>> +			if (test_bit(cpu, &tar)) {
>> +				mpidr = per_cpu(cpu)->mpidr;
>> +				targets |= (1 << MPIDR_AFFINITY_LEVEL(mpidr, 0));
>> +				aff1 = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +				aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2);
>> +				aff3 = MPIDR_AFFINITY_LEVEL(mpidr, 3);
>> +			}
>> +		}
>> +	}
>> +
>> +	if (sgi->routing_mode == 2) {
>> +		mpidr = phys_processor_id();
>> +		targets = 1 << MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	}
>>
>> -	val = (u64)sgi->aff3 << ICC_SGIR_AFF3_SHIFT
>> -		| (u64)sgi->aff2 << ICC_SGIR_AFF2_SHIFT
>> -		| sgi->aff1 << ICC_SGIR_AFF1_SHIFT
>> +	val = aff3 << ICC_SGIR_AFF3_SHIFT
>> +		| aff2 << ICC_SGIR_AFF2_SHIFT
>> +		| aff1 << ICC_SGIR_AFF1_SHIFT
>>  		| (targets & ICC_SGIR_TARGET_MASK)
>>  		| (sgi->id & 0xf) << ICC_SGIR_IRQN_SHIFT;
>>
>> @@ -304,17 +322,41 @@ static int gic_send_sgi(struct sgi *sgi)
>>  	return 0;
>>  }
>>
>> +#define SGIR_TO_AFFINITY(sgir, level)	\
>> +	(sgir >> ICC_SGIR_AFF## level ##_SHIFT & 0xff)
>> +
>> +#define SGIR_TO_MPIDR_AFFINITY(sgir, level)	\
>> +	(SGIR_TO_AFFINITY(sgir, level) << MPIDR_LEVEL_SHIFT(level))
>> +
>>  void gicv3_handle_sgir_write(u64 sgir)
>>  {
>>  	struct sgi sgi;
>>  	unsigned long routing_mode = !!(sgir & ICC_SGIR_ROUTING_BIT);
>> +	unsigned long targets = 0, tar;
>> +	struct cell *cell;
>> +	u64 cluster_id, cpu, clst;
>> +	u16 aff0;
>> +
>> +	cluster_id = SGIR_TO_MPIDR_AFFINITY(sgir, 3) |
>> +		     SGIR_TO_MPIDR_AFFINITY(sgir, 2) |
>> +		     SGIR_TO_MPIDR_AFFINITY(sgir, 1);
>> +	tar = sgir & ICC_SGIR_TARGET_MASK;
>> +
>> +	for_each_cell(cell) {
>> +		for_each_cpu(cpu, cell->cpu_set) {
>> +			clst = per_cpu(cpu)->mpidr & ~0xffUL;
>> +			aff0 = MPIDR_AFFINITY_LEVEL(per_cpu(cpu)->mpidr, 0);
>> +			if ((cluster_id == clst) && (test_bit(aff0, &tar))) {
>> +				targets |= (1 << cpu);
>> +			}
>> +		}
>> +	}

> These two loops look heavy, and I'm also concerned that they can be
> racy: consider one non-root cell being destroyed/created in parallel to
> another non-root cell still running (absolutely possible and fine so
> far). In that case, the list of cells is not stable. So you should
> never iterate over all cells while in operational mode.
>
> Why isn't it enough to iterate over the caller's cell only? This
> function here is only used when we trapped a cell while trying to send
> an SGI, not to send hypervisor-internal SGIs.
ahh.. you are right. It is not required for gicv3_handle_sgir_write.
Initially I started with one loop (for_each_cpu(this_cell)), but then
realized that arm_cpu_kick is using it for internal signaling and added
for_each_cell in both places. My bad, will update it in v2.

> It's different, though, for gic_send_sgi. That function is also used by
> arm_cpu_kick, i.e. for hypervisor-internal signaling. I suppose we need
> some tables here to do static lookups during runtime instead of list
> iterations and ID matches.

Since we are looking only for CPUs, the complexity still remains bounded
by the maximum number of CPUs in the system. Even if static lookups are
implemented per CPU, we still need to iterate over all the CPUs, which
gives the same complexity. Correct me if my understanding is wrong.

Thanks and regards,
Lokesh

>
>>
>> -	/* FIXME: clusters are not supported yet. */
>> -	sgi.targets = sgir & ICC_SGIR_TARGET_MASK;
>> +	sgi.targets = targets;
>>  	sgi.routing_mode = routing_mode;
>> -	sgi.aff1 = sgir >> ICC_SGIR_AFF1_SHIFT & 0xff;
>> -	sgi.aff2 = sgir >> ICC_SGIR_AFF2_SHIFT & 0xff;
>> -	sgi.aff3 = sgir >> ICC_SGIR_AFF3_SHIFT & 0xff;
>> +	sgi.aff1 = SGIR_TO_AFFINITY(sgir, 1);
>> +	sgi.aff2 = SGIR_TO_AFFINITY(sgir, 2);
>> +	sgi.aff3 = SGIR_TO_AFFINITY(sgir, 3);
>>  	sgi.id = sgir >> ICC_SGIR_IRQN_SHIFT & 0xf;
>>
>>  	gic_handle_sgir_write(&sgi, true);
>>
>
> Jan

-- 
You received this message because you are subscribed to the Google Groups "Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
