Thank you, this patch works!
On Monday, September 4, 2023 at 7:02:07 PM UTC+8 Ralf Ramsauer wrote:
>
>
> On 04/09/2023 10:11, bot crack wrote:
> > The jailhouse system hangs when running on the 4*A55 board
> >
> > jailhouse-config-check:
> > Reading configuration set:
> > Architecture: arm64
> > Root cell: RootCell (a55-main.cell)
> > Overlapping memory regions inside cell: None
> > Overlapping memory regions with hypervisor: None
> > Missing PCI MMCONFIG interceptions: None
> > Missing resource interceptions for architecture arm64: None
> >
> > Jailhouse startup output:
> > Initializing Jailhouse hypervisor on CPU 3
> > Code location: 0x0000ffffc0200800
> > Page pool usage after early setup: mem 39/992, remap 0/131072
> > Initializing processors:
> > CPU 3... OK
> > CPU 2... OK
> > CPU 0... OK
> > CPU 1... OK
> > Initializing unit: irqchip
> > Initializing unit: ARM SMMU v3
> > Initializing unit: ARM SMMU
> > Initializing unit: PVU IOMMU
> > Initializing unit: PCI
> > Adding virtual PCI device 00:00.0 to cell "RootCell"
> > Page pool usage after late setup: mem 64/992, remap 144/131072
> > Activating hypervisor
> >
> > 1. When I run weston (using OpenGL ES) on Linux in the root cell, one
> > CPU goes to 100% usage. I traced that CPU with ftrace; an excerpt
> > follows (see the attachment for the full output):
> > # tracer: function_graph
> > #
> > # CPU DURATION FUNCTION CALLS
> > # | | | | | | |
> > 2) | _raw_spin_lock_irqsave() {
> > 2) 0.583 us | do_raw_spin_lock();
> > 2) 3.500 us | }
> > 2) | ktime_get_update_offsets_now() {
> > 2) 0.583 us | arch_counter_read();
> > 2) 1.750 us | }
> > 2) | __hrtimer_run_queues() {
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 0.583 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) | tick_sched_timer() {
> > 2) | ktime_get() {
> > 2) 0.875 us | arch_counter_read();
> > 2) 1.750 us | }
> > 2) | tick_sched_do_timer() {
> > 2) | tick_do_update_jiffies64.part.0() {
> > 2) | _raw_spin_lock() {
> > 2) 0.583 us | do_raw_spin_lock();
> > 2) 1.750 us | }
> > 2) | do_timer() {
> > 2) 0.584 us | calc_global_load();
> > 2) 1.750 us | }
> > 2) | _raw_spin_unlock() {
> > 2) 0.584 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) | update_wall_time() {
> > 2) | timekeeping_advance() {
> > 2) | _raw_spin_lock_irqsave() {
> > 2) 0.584 us | do_raw_spin_lock();
> > 2) 2.042 us | }
> > 2) 0.584 us | arch_counter_read();
> > 2) 0.583 us | ntp_tick_length();
> > 2) 0.583 us | ntp_tick_length();
> > 2) | timekeeping_update() {
> > 2) 0.583 us | ntp_get_next_leap();
> > 2) 0.875 us | update_vsyscall();
> > 2) 0.583 us | raw_notifier_call_chain();
> > 2) 4.375 us | }
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 0.583 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) + 14.000 us | }
> > 2) + 15.166 us | }
> > 2) + 23.334 us | }
> > 2) + 24.500 us | }
> > 2) | update_process_times() {
> > 2) | account_process_tick() {
> > 2) | account_system_time() {
> > 2) | account_system_index_time() {
> > 2) | cpuacct_account_field() {
> > 2) 0.583 us | __rcu_read_lock();
> > 2) 0.584 us | __rcu_read_unlock();
> > 2) 3.208 us | }
> > 2) 0.584 us | __rcu_read_lock();
> > 2) 0.583 us | __rcu_read_unlock();
> > 2) | cpufreq_acct_update_power() {
> > 2) | _raw_spin_lock_irqsave() {
> > 2) 0.584 us | do_raw_spin_lock();
> > 2) 1.750 us | }
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 0.583 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) 5.250 us | }
> > 2) + 12.542 us | }
> > 2) + 13.708 us | }
> > 2) + 14.875 us | }
> > 2) | run_local_timers() {
> > 2) 0.584 us | hrtimer_run_queues();
> > 2) 1.750 us | }
> > 2) | rcu_sched_clock_irq() {
> > 2) 0.875 us | rcu_is_cpu_rrupt_from_idle();
> > 2) 0.584 us | rcu_qs();
> > 2) 0.583 us | rcu_is_cpu_rrupt_from_idle();
> > 2) 0.583 us | rcu_segcblist_ready_cbs();
> > 2) 5.541 us | }
> > 2) | scheduler_tick() {
> > 2) | _raw_spin_lock() {
> > 2) 0.584 us | do_raw_spin_lock();
> > 2) 1.750 us | }
> > 2) 0.584 us | update_rq_clock();
> > 2) 0.875 us | update_thermal_load_avg();
> > 2) | task_tick_fair() {
> > 2) | update_curr() {
> > 2) 0.583 us | update_min_vruntime();
> > 2) | cpuacct_charge() {
> > 2) 0.583 us | __rcu_read_lock();
> > 2) 0.875 us | __rcu_read_unlock();
> > 2) 4.083 us | }
> > 2) 0.584 us | __rcu_read_lock();
> > 2) 0.583 us | __rcu_read_unlock();
> > 2) 9.041 us | }
> > 2) 0.875 us | __update_load_avg_se();
> > 2) | __update_load_avg_cfs_rq() {
> > 2) 0.875 us | __accumulate_pelt_segments();
> > 2) 2.042 us | }
> > 2) 0.584 us | update_cfs_group();
> > 2) 0.583 us | hrtimer_active();
> > 2) + 16.625 us | }
> > 2) 0.583 us | calc_global_load_tick();
> > 2) | _raw_spin_unlock() {
> > 2) 0.875 us | do_raw_spin_unlock();
> > 2) 2.042 us | }
> > 2) | trigger_load_balance() {
> > 2) 1.166 us | nohz_balance_exit_idle();
> > 2) 0.583 us | __rcu_read_lock();
> > 2) 0.583 us | __rcu_read_unlock();
> > 2) 4.959 us | }
> > 2) + 32.083 us | }
> > 2) 0.584 us | run_posix_cpu_timers();
> > 2) + 58.333 us | }
> > 2) 0.584 us | hrtimer_forward();
> > 2) + 88.375 us | }
> > 2) | _raw_spin_lock_irq() {
> > 2) 0.583 us | do_raw_spin_lock();
> > 2) 1.750 us | }
> > 2) 0.875 us | enqueue_hrtimer();
> > 2) + 95.666 us | }
> > 2) 0.875 us | __hrtimer_next_event_base();
> > 2) 0.875 us | __hrtimer_next_event_base();
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 0.583 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) | tick_program_event() {
> > 2) | clockevents_program_event() {
> > 2) | ktime_get() {
> > 2) 0.583 us | arch_counter_read();
> > 2) 1.750 us | }
> > 2) 0.875 us | arch_timer_set_next_event_phys();
> > 2) 4.084 us | }
> > 2) 5.250 us | }
> > 2) ! 114.333 us | } /* hrtimer_interrupt */
> > 2) ! 115.792 us | } /* arch_timer_handler_phys */
> > 2) 0.583 us | gic_eoimode1_eoi_irq();
> > 2) ! 118.125 us | } /* handle_percpu_devid_irq */
> > 2) | irq_exit() {
> > 2) 0.584 us | idle_cpu();
> > 2) 1.750 us | }
> > 2) ! 127.750 us | } /* __handle_domain_irq */
> > 2) ! 130.084 us | } /* gic_handle_irq */
> > 2) <========== |
> > 2) ==========> |
> > 2) | gic_handle_irq() {
> > 2) 0.583 us | gic_read_iar();
> > 2) | __handle_domain_irq() {
> > 2) | irq_find_mapping() {
> > 2) 0.584 us | __rcu_read_lock();
> > 2) 0.583 us | __rcu_read_unlock();
> > 2) 2.917 us | }
> > 2) | irq_enter() {
> > 2) 0.583 us | irq_enter_rcu();
> > 2) 1.750 us | }
> > 2) | handle_percpu_devid_irq() {
> > 2) | arch_timer_handler_phys() {
> > 2) | hrtimer_interrupt() {
> > 2) | _raw_spin_lock_irqsave() {
> > 2) 0.583 us | do_raw_spin_lock();
> > 2) 2.042 us | }
> > 2) | ktime_get_update_offsets_now() {
> > 2) 0.584 us | arch_counter_read();
> > 2) 2.042 us | }
> > 2) | __hrtimer_run_queues() {
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 1.166 us | do_raw_spin_unlock();
> > 2) 2.042 us | }
> > 2) | tick_sched_timer() {
> > 2) | ktime_get() {
> > 2) 0.583 us | arch_counter_read();
> > 2) 2.042 us | }
> > 2) | tick_sched_do_timer() {
> > 2) | tick_do_update_jiffies64.part.0() {
> > 2) | _raw_spin_lock() {
> > 2) 0.875 us | do_raw_spin_lock();
> > 2) 1.750 us | }
> > 2) | do_timer() {
> > 2) 0.583 us | calc_global_load();
> > 2) 2.333 us | }
> > 2) | _raw_spin_unlock() {
> > 2) 0.583 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) | update_wall_time() {
> > 2) | timekeeping_advance() {
> > 2) | _raw_spin_lock_irqsave() {
> > 2) 0.583 us | do_raw_spin_lock();
> > 2) 2.333 us | }
> > 2) 0.584 us | arch_counter_read();
> > 2) 0.583 us | ntp_tick_length();
> > 2) 0.584 us | ntp_tick_length();
> > 2) | timekeeping_update() {
> > 2) 0.583 us | ntp_get_next_leap();
> > 2) 0.875 us | update_vsyscall();
> > 2) 0.583 us | raw_notifier_call_chain();
> > 2) 4.375 us | }
> > 2) | _raw_spin_unlock_irqrestore() {
> > 2) 0.875 us | do_raw_spin_unlock();
> > 2) 1.750 us | }
> > 2) + 14.875 us | }
> > 2) + 16.042 us | }
> > 2) + 24.792 us | }
> > 2) + 25.959 us | }
> >
> >
> >
> > 2. When I run cat /sys/kernel/debug/clk/clk_summary, the entire system
> > hangs without printing any output.
> >
> >
> >
> > How should I debug these two problems?
>
> I have seen similar issues on an S32G board. It took me days to find out
> that we had unhandled SMCs. Return values were not checked in the
> kernel, the SMC was issued, and miles later, the kernel hung. Your
> bug smells like it could be the same issue. Would you please
> try the following patch to see if you have unhandled SMCs:
>
> diff --git a/hypervisor/arch/arm-common/smccc.c b/hypervisor/arch/arm-common/smccc.c
> index 65639b59..9b3af5b3 100644
> --- a/hypervisor/arch/arm-common/smccc.c
> +++ b/hypervisor/arch/arm-common/smccc.c
> @@ -136,6 +136,7 @@ enum trap_return handle_smc(struct trap_context *ctx)
>  		break;
>
>  	default:
> +		printk("We have unhandled SMCs\n");
>  		ret = TRAP_UNHANDLED;
>  	}
>
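To narrow down which call is unhandled, it may help to print the SMC function ID rather than a fixed string. Below is a minimal, standalone sketch of how such an ID splits into fields under the Arm SMC Calling Convention (bit 31: fast call, bit 30: SMC64, bits 29:24: owning service, bits 15:0: function number). The names `smccc_id`, `smccc_decode` and `smccc_owner_name` are hypothetical and only for illustration; they are not Jailhouse functions, and the sketch assumes the ID is taken from the first argument register (w0/x0), as the convention specifies.

```c
#include <stdint.h>

/* Fields of an SMCCC function ID (the value passed in w0/x0). */
struct smccc_id {
	unsigned int fast;   /* bit 31: 1 = fast call, 0 = yielding call */
	unsigned int smc64;  /* bit 30: 1 = SMC64, 0 = SMC32 */
	unsigned int owner;  /* bits 29:24: owning service range */
	unsigned int func;   /* bits 15:0: function number */
};

/* Hypothetical helper (not part of Jailhouse): split a function ID. */
static struct smccc_id smccc_decode(uint32_t id)
{
	struct smccc_id d;

	d.fast = (id >> 31) & 0x1;
	d.smc64 = (id >> 30) & 0x1;
	d.owner = (id >> 24) & 0x3f;
	d.func = id & 0xffff;
	return d;
}

/* Hypothetical helper: readable name for the owner field. */
static const char *smccc_owner_name(unsigned int owner)
{
	switch (owner) {
	case 0:
		return "Arm architecture";
	case 1:
		return "CPU service";
	case 2:
		return "SiP service";
	case 3:
		return "OEM service";
	case 4:
		return "Standard secure service";
	case 5:
		return "Standard hypervisor service";
	case 6:
		return "Vendor hypervisor service";
	default:
		/* 48..63 belong to trusted applications and trusted OSes. */
		return owner >= 48 ? "Trusted application/OS" : "Other/reserved";
	}
}
```

For example, 0xC4000003 decodes as a fast SMC64 call in the standard secure service range with function number 3, which is PSCI CPU_ON in the 64-bit convention. An ID with owner 2 would be a SiP call, i.e. one implemented by the SoC vendor's firmware. Printing these fields in the default branch would show which service the guest is calling, assuming the hypervisor's printk supports the format specifiers needed.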
>
>
> If you get reports of unhandled SMCs (Linux should handle this correctly,
> by the way), you can, as a first workaround, simply forward them
> with this patch:
>
>
> https://github.com/lfd/jailhouse/commit/3a88b0b371aeb649bc496d8c272b5d3ab5de3982
>
> Ralf
>
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Jailhouse" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to [email protected].
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/jailhouse-dev/204a5f33-51e3-482a-95e5-14941c87154dn%40googlegroups.com.
>