Hesham,

Thank you for the response and the tips.
I tried out the master version of sel4test and still don't seem to be
able to complete all of the tests. It looks like it crashed after
TEST_FPU0002. I still have some debugging to do, but I have a better
idea of where to go now, thanks.

Jesse

On 2017-08-08 07:56 PM, [email protected] wrote:
> Hi Jesse
>
> On 09/08/17 06:45, Jesse Millwood wrote:
>> Hello,
>>
>> I realize that the only officially supported SMP platform is the i.mx6
>> but I did see some code on 6.0 to master for SMP and the zynq7000 so I
>> am trying to test out some SMP functionality on my zc702 board.
>>
>> I am wondering if anyone has this working because my system is failing
>> when compiling from master and the 6.0 and compatible tags.
>>
> 6.0.x release doesn't support SMP on Zynq, it's supposed to be in the
> next release. However, you can still use master branches to get SMP/Zynq
> work. Another option is to just use the master branch in the elfloader
> since it fixes SMP related bugs and adds a reset functionality to reset
> zynq's secondary core.
>
> From the config you provided, it seems like you enabled printing and
> debug mode. Zynq/SMP has only been tested in release mode on
> sel4test/zc706. This means it might not work if you enabled printing
> and/or debug mode. Furthermore, we haven't tested it on camkes-based
> projects yet.
>
> From your thorough output (thanks!), everything seems to be fine; core1
> should be going to restore_user_context and proceed with the idleThread
> after releasing the lock, and core0 should get it, and go to the root
> thread.
>
> I'd suggest you try disabling printing/debug mode and run on
> sel4test/master (bamboo_zynq7000_smp_release_xml_defconfig), just to
> make sure this works on your board.
>
> Please let us know if you still have the same issue.
>
> Cheers,
> Hesham
>
>> My second core seems to be coming up but the system ultimately fails
>> and prints out:
>>
>> Bo
>>
>> ot
>>
>> nKgE RaNlEl L fDinAiTsAh eAdBO, RdT!r
>>
>> ppFaeud lttio nugs eirn stspraucctei o
>>
>> n: 0xe001d1b0 o
>>
>> FAR: 0xfffffff8 DFSR: 0x807
>>
>> halting...
>>
>> Kernel entry via Syscall, number: 1, Call
>>
>> Cap type: 1, Invocation tag: 37
>>
>> Which seems to be “Booting all finished, dropped to user space” from
>> core0 and “KERNEL DATA ABORT!” from core1.
>>
>> The “0xe001d1b0” seems to be the label of the “idle_thread” function.
>>
>> While stepping through via JTAG, I have verified that core1 gets
>> through “init_kernel” and then enters “restore_user_context”. At some
>> point in “restore_user_context” the fault registers are set as shown
>> in the printed output. I think it is either in the c_exit_hook in
>> restore_user_context or after the program branches to “0xFFFF0010”
>> which is “ldr pc,0xFFFF0030”. This branches to the
>> “arm_data_abort_exception” label, which goes to the “kernel_data_fault”
>> label and then to “kernel data abort”.
>>
>> I’m having trouble pinpointing exactly where the fault occurs but it
>> seems to be close to there.
>>
>> Has anyone had similar issues with SMP? It seems to get fairly far
>> without setting the fault registers.
>>
>> I have tried to step through the execution over JTAG and here are some
>> of my (verbose) notes
>>
>> | CORE0 Address | Core0 Function              | Core0 Instruction      | CORE1 Address | Core1 Function                     | Core1 Instruction    | DFSR       | DFAR       | Note |
>> |---------------+-----------------------------+------------------------+---------------+------------------------------------+----------------------+------------+------------+------|
>> | 0x10000000    | label: start                | =cpsid aif=            | 0xFFFFFF34    |                                    | =mvn r0,#0x0f=       | 0x00000000 | 0x00005000 |      |
>> | 0x10003A2C    | call: platform_init         | =bl 0x10003DD8=        | 0xFFFFFF30    |                                    | =wfe=                | 0x00000000 | 0x00005000 |      |
>> | 0x10003ACC    | call: smp_boot              | =bl 0x100039FC=        | 0xFFFFFF34    |                                    | =mvn r0,#0x0f=       | 0x00000000 | 0x00005000 |      |
>> | 0x10003AD0    | ret: smp_boot               | =bl 0x10005C54=        | 0x10000020    | in: non_boot_core                  | =orr r0,r0,#0x40=    | 0x00000000 | 0x00005000 | 2    |
>> | 0x10003ADC    | =if(is_hyp_mode())=         | =beq 0x10003AF0=       | 0x10002200    | label: arm_disable_dcaches         | =push {r14}=         | 0x00000000 | 0x00005000 |      |
>> | 0x10003AFC    | call: arm_enable_mmu        | =bl 0x10002174=        | 0xE0006190    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1    |
>> | 0xE0001D70    | label: init_kernel          | =push {r11,r14}=       | 0xE0001680    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1    |
>> | 0xE0001814    | label: try_init_kernel      | =push {r11,r14}=       | 0xE0001690    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1    |
>> | 0xE0001B80    | call: create_initial_thread | =str r0, [r11,#-0x14]= | 0xE0001680    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1    |
>> | 0xE0001C48    | call: SMP_COND_STATEMENT    | =bl 0xE0003C20=        | 0xE0001680    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1, 3 |
>> | 0xE0001C4C    | call: SMP_COND_STATEMENT    | =bl 0xE00017D8=        | 0xE0001680    | in: try_init_kernel_secondary_core | =beq 0xE0001680=     | 0x00000000 | 0x00005000 | 1, 4 |
>> | 0xE0001C50    | NODE_LOCK_SYS               | =bl 0xE0019280=        | 0xE0019288    | in: getCurrentCPUIndex             | =sub r13,r13,#0x8=   | 0x00000000 | 0x00005000 | 5    |
>> | 0xE0001D1C    | call: arch_pause            | =bl 0xE0019DB0=        | 0xE0019290    | in: getCurrentCPUIndex             | =str r0,[r11,#-0x8]= | 0x00000000 | 0x00005000 | 6    |
>> | 0xE0001D40    | in: clh_lock_acquire        | =uxtb r3,r3=           | 0xE00037F4    | in: init_core_state                | =pop {r4,r11,pc}=    | 0x00000000 | 0x00005000 |      |
>> | 0xE0001D20    | in: clh_lock_acquire        | =mov r2,#0xE800=       | 0xE0003754    | in: init_core_state                | =movw r2,#0xE8E0=    | 0x00000000 | 0x00005000 | 7    |
>> | 0xE0001D40    | in: clh_lock_acquire        | =uxtb r3,r3=           | 0xE00017C4    | in: try_init_kernel_secondary      | =mov r3,#0x1=        | 0x00000000 | 0x00005000 | 8    |
>> | 0xE0001D40    | in: clh_lock_acquire        | =uxtb r3,r3=           | 0xE002A06C    | in: schedule                       | =push {r11,r14}=     | 0x00000000 | 0x00005000 | 9    |
>> | 0xE0001D40    | in: clh_lock_acquire        | =uxtb r3,r3=           | 0xE002979C    | in: activateThread                 | =push {r11, r14}=    | 0x00000000 | 0x00005000 | 10   |
>> | 0xE0001D40    | in: clh_lock_acquire        | =uxtb r3,r3=           | 0xE001D24C    | label: Arch_activateIdleThread     | =push {r11}=         | 0x00000000 | 0x00005000 | 11   |
>> | 0xE0001D38    | in: clh_lock_acquire        | =ldr r3,[r3,#0x4]=     | 0xE0000054    | in: start                          | =b 0xE001CEC8=       | 0x00000000 | 0x00005000 | 12   |
>>
>> Notes
>>
>> 1. Core1 is in a =while (!node_boot_lock)= loop
>>
>> 2. In =smp_boot=, CORE1 changes after =init_cpus= (branch
>>    location: ZSR:10003A08)
>>    - In =smp_boot=, =boot_cpus= is called
>>      - This sets the =CPU_JUMP_PTR=:
>>        =*((volatile uint32_t*)CPU_JUMP_PTR) = (uint32_t)entry;=
>>      - calls =dsb= (data synchronization barrier)
>>        - After this call, CPU1 goes to =FFFFFF2C: dsb sy=
>>      - And then =sev=
>>        - After this call, CPU1 goes to the =non_boot_core= label
>>    - SEV
>>      - SEV causes an event to be signaled to all cores within a
>>        multiprocessor system. If SEV is implemented, WFE must also
>>        be implemented.
>>    - WFE
>>      - If the Event Register is not set, WFE suspends execution
>>        until one of the following events occurs:
>>        - an IRQ interrupt, unless masked by the CPSR I-bit
>>        - an FIQ interrupt, unless masked by the CPSR F-bit
>>        - an Imprecise Data abort, unless masked by the CPSR A-bit
>>        - a Debug Entry request, if Debug is enabled
>>        - an Event signaled by another processor using the SEV
>>          instruction.
>>      - If the Event Register is set, WFE clears it and returns
>>        immediately.
>>      - If WFE is implemented, SEV must also be implemented.
>>    - After CPU0 executes =arm_enable_mmu()= from the =main= function
>>      - by the end of =smp_boot= core1 is just starting =non_boot_main=
>>
>> 3. The =SMP_COND_STATEMENT= is calling =clh_lock_init=
>>
>> 4. The =SMP_COND_STATEMENT= is calling =release_secondary_cpus=
>>
>> 5. right after Core0 returned from releasing secondary cpus
>>    - First time Core1 has exited the loop
>>    - Core1's stack is
>>      - =getCurrentCPUIndex=
>>      - =init_core_state=
>>      - =try_init_kernel_secondary_core=
>>      - =init_kernel=
>>
>> 6. Core0 is in a
>>    =while(big_kernel_lock.node_owners[cpu].next->value != CLHState_Granted)=
>>    - Core0 is in a static inline function =clh_lock_acquire= in
>>      =try_init_kernel=
>>    - Core1 is in =getCurrentCPUIndex= but being called from
>>      =tcbDebugAppend=
>>    - =tcbDebugAppend= is being called from a =for= loop in
>>      =init_core_state=
>>
>> 7. Core0 is still in the previously mentioned while loop
>>    - Core1 is in "init_core_state" and has exited the for loop
>>      that called =tcbDebugAppend(NODE_STATE_ON_CORE(ksIdleThread, i))=
>>
>> 8. Core0 is still in the previously mentioned while loop
>>    - Core1 has returned to =try_init_kernel_secondary_core= from
>>      =init_core_state= and is at the end of the function
>>
>> 9. Core0 is still in the previously mentioned while loop
>>    - Core1 has entered the =init_kernel= call and then the
>>      =schedule= function.
>>
>> 10. Core0 is still in the previously mentioned while loop
>>     - Core1 has entered the =activateThread= call after
>>       =schedule= in =init_kernel=
>>
>> 11. Core0 is still in the previously mentioned while loop
>>     - Core1 seems to have dropped into the
>>       =case ThreadState_IdleThreadState:= case when switching on
>>       =switch (thread_state_get_tsType(NODE_STATE(ksCurThread)->tcbState))=
>>     - This was in =activateThread=
>>
>> 12. Core0 is still in the previously mentioned while loop
>>     - Core1 has exited =init_kernel= and is now branching
>>       to =restore_user_context=
>>
>> To test everything out I am using the “camkes-sols-master” manifest
>> and building the “CAmkES Hello World application with events and
>> dataports”.
>>
>> The changes I made are
>>
>> · I edited the top level CAmkES file to set the affinity for
>>   two separate cores.
>>
>> · Upped the Max Number of CPU nodes to 2
>>
>> The rest of the config is pretty standard. I have it attached to this
>> message.
>>
>> The FSBL and ps7_init script I use are the standard ones created for
>> the zc702 from the 2017.2 version of the Xilinx XSDK.
>>
>> I am booting from jtag and first run the ps7_init script and then
>> flash the fsbl and then the
>> “capdl-loader-experimental-image-arm-zynq7000” that was built.
>>
>> I am wondering if anyone is using a modified fsbl or ps7_init that
>> does something else, if there is a config value that I missed, or if
>> it is still in development? If it is still in development I’d like to
>> work with whoever is
>>
>> Thanks,
>>
>> Jesse Millwood
>>
>> _______________________________________________
>> Devel mailing list
>> [email protected]
>> https://sel4.systems/lists/listinfo/devel
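
For readers following the trace: note 6 in the quoted message describes
Core0 spinning on
=while(big_kernel_lock.node_owners[cpu].next->value != CLHState_Granted)=.
That is the usual shape of a CLH queue lock, where each CPU spins on the
node of whoever queued before it. Below is a minimal sketch of that
handoff in C11. It is not seL4's implementation: the names (clh_lock,
clh_enqueue, clh_is_granted, and so on), the two-CPU limit, and the
spare seed node are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CPUS 2  /* assumption: two cores, as in the thread above */

enum clh_state { CLH_GRANTED, CLH_PENDING };

struct clh_node {
    _Atomic int value;              /* CLH_GRANTED or CLH_PENDING */
};

/* Hypothetical layout; seL4's big_kernel_lock differs in detail. */
struct clh_lock {
    struct clh_node nodes[NUM_CPUS + 1];  /* one spare node seeds the queue */
    struct clh_node *_Atomic tail;        /* most recently queued node */
    struct clh_node *mine[NUM_CPUS];      /* node this CPU queues next */
    struct clh_node *pred[NUM_CPUS];      /* predecessor node this CPU waits on */
};

void clh_init(struct clh_lock *l)
{
    for (int i = 0; i < NUM_CPUS; i++) {
        atomic_init(&l->nodes[i].value, CLH_GRANTED);
        l->mine[i] = &l->nodes[i];
        l->pred[i] = NULL;
    }
    /* The spare node starts out granted, so the first CPU gets the lock. */
    atomic_init(&l->nodes[NUM_CPUS].value, CLH_GRANTED);
    atomic_init(&l->tail, &l->nodes[NUM_CPUS]);
}

/* Join the queue: mark our node pending and swap it in as the new tail. */
void clh_enqueue(struct clh_lock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NUM_CPUS);
    atomic_store(&l->mine[cpu]->value, CLH_PENDING);
    l->pred[cpu] = atomic_exchange(&l->tail, l->mine[cpu]);
}

/* True once the predecessor has released the lock. */
bool clh_is_granted(struct clh_lock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NUM_CPUS);
    return atomic_load(&l->pred[cpu]->value) == CLH_GRANTED;
}

/* Blocking acquire: the spin loop a waiting core would sit in. */
void clh_acquire(struct clh_lock *l, int cpu)
{
    clh_enqueue(l, cpu);
    while (!clh_is_granted(l, cpu)) {
        /* spin; a real kernel would issue a pause or yield hint here */
    }
}

/* Release: grant our node to the successor, then reuse the predecessor's. */
void clh_release(struct clh_lock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NUM_CPUS);
    atomic_store(&l->mine[cpu]->value, CLH_GRANTED);
    l->mine[cpu] = l->pred[cpu];
}
```

In this sketch, a CPU that queues behind a holder stays in the loop in
clh_acquire until the holder calls clh_release, which is consistent with
Core0 sitting in the same while loop through notes 6 to 12 while Core1
still holds the lock.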
