Re: [PATCH] gianfar: Fix TX ring processing on SMP machines
From: Anton Vorontsov avoront...@ru.mvista.com
Date: Wed, 3 Mar 2010 21:18:58 +0300

> Starting with commit a3bc1f11e9b867a4f49505 (gianfar: Revive SKB
> recycling) gianfar driver sooner or later stops transmitting any
> packets on SMP machines.
>
> start_xmit() prepares new skb for transmitting, generally it does
> three things:
>
> 1. sets up all BDs (marks them ready to send), except the first one.
> 2. stores skb into tx_queue->tx_skbuff so that clean_tx_ring()
>    would clean it up later.
> 3. sets up the first BD, i.e. marks it ready.
>
> Here is what clean_tx_ring() does:
>
> 1. reads skbs from tx_queue->tx_skbuff
> 2. checks if the *last* BD is ready. If it's still ready [to send]
>    then it isn't transmitted, so clean_tx_ring() returns.
>    Otherwise it actually cleans up BDs. All is OK.
>
> Now, if there is just one BD, code flow:
>
> - start_xmit(): stores skb into tx_skbuff. Note that the first BD
>   (which is also the last one) isn't marked as ready, yet.
> - clean_tx_ring(): sees that skb is not null, *and* its lstatus
>   says that it is NOT ready (like if BD was sent), so it cleans
>   it up (bad!)
> - start_xmit(): marks BD as ready [to send], but it's too late.
>
> We can fix this simply by reordering lstatus/tx_skbuff writes.
>
> Reported-by: Martyn Welch martyn.we...@ge.com
> Bisected-by: Paul Gortmaker paul.gortma...@windriver.com
> Signed-off-by: Anton Vorontsov avoront...@ru.mvista.com
> Tested-by: Paul Gortmaker paul.gortma...@windriver.com
> Tested-by: Martyn Welch martyn.we...@ge.com

Applied.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
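The single-BD race described in the commit message can be modelled in a few lines of plain C. This is only a sketch under stated assumptions, not the driver code: struct tx_slot, BD_READY, looks_transmitted() and cleanup_fires_between_writes() are hypothetical names made up for illustration, and the model is single-threaded, so it shows only why the order of the two writes matters. A real driver would presumably also need suitable ordering barriers, which are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BD_READY 0x1u	/* hypothetical stand-in for the "ready to send" bit */

/* One TX slot: the BD status word plus the tx_skbuff[] entry that goes with it. */
struct tx_slot {
	unsigned int lstatus;
	const void *skb;
};

/*
 * What the cleanup path concludes: a stored skb whose BD is not marked
 * ready looks like a finished transmit and would be reclaimed.
 */
static bool looks_transmitted(const struct tx_slot *slot)
{
	return slot->skb != NULL && !(slot->lstatus & BD_READY);
}

/*
 * Perform only the first of the two writes done by the transmit path,
 * then report what a cleanup running at exactly that moment would decide.
 */
static bool cleanup_fires_between_writes(bool store_skb_first)
{
	static const char packet[] = "payload";
	struct tx_slot slot = { 0, NULL };

	assert(!looks_transmitted(&slot));	/* an empty slot is never reclaimed */

	if (store_skb_first)
		slot.skb = packet;		/* old order: skb visible, BD not ready yet */
	else
		slot.lstatus |= BD_READY;	/* new order: BD ready, skb not stored yet */

	return looks_transmitted(&slot);
}
```

With the skb stored first, the intermediate state is indistinguishable from a completed transmit; with the status written first, the cleanup check finds either no skb or a BD that is still ready, so it leaves the slot alone.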
Re: [Patch v.2] mpc5200b/uart: improve baud rate calculation (reach high baud rates, better accuracy)
Hi Grant:

Thanks a lot for your input!

[snip]
> Save yourself some duplicated code here. The above 14 lines can be
> shared between the 512x, 52xx and 5200b versions. Create yourself an
> internal __mpc5xxx_psc_set_divisor() function that is passed the *psc,
> the divisor, and the clock select register setting (both the 5200 and
> the 5121 have the clock select register).

Hmm, yes, that's true. Will look into that.

[snip]
>> @@ -604,7 +676,6 @@ mpc52xx_uart_set_termios(struct uart_por
>> 	baud = uart_get_baud_rate(port, new, old, 0, port->uartclk/16);
>
> I'm probably nitpicking, because I don't know if the I/O pin will
> handle this speed, but uartclk/16 is no longer the maximum baud rate
> if a /4 prescaler is used.

Yes, you are right. Must of course be fixed.

[snip]
>> @@ -635,8 +706,7 @@ mpc52xx_uart_set_termios(struct uart_por
>> 	out_8(&psc->command, MPC52xx_PSC_SEL_MODE_REG_1);
>> 	out_8(&psc->mode, mr1);
>> 	out_8(&psc->mode, mr2);
>> -	out_8(&psc->ctur, ctr >> 8);
>> -	out_8(&psc->ctlr, ctr & 0xff);
>> +	psc_ops->set_divisor(port, quot);
>
> Hmmm. The divisor calculations have some tricky bits to them. I would
> consider changing the set_divisor() function to accept a baud rate, and
> modify the set_divisor function to call uart_get_divisor().

That sounds like a good idea to me. I will change the code that way.

> That way each set_divisor() can do whatever makes the most sense for the
> divisors available to it. The 5121 for example has both a /10 and a /32
> divisor, plus it can use an external clock.

Ouch. I don't have a 512x, but isn't the current code plain wrong then? It uses mpc5xxx_get_bus_frequency() as input for the baud rate calculation, and if the serial code assumes /16 instead of /10, the result must be terribly off. Or did I miss something here?

Best, Albrecht.
Strange OOPS in 2.6.33
Got this OOPS a few times after cold-starting our board a few times:

Unable to handle kernel paging request for unknown fault
Faulting instruction address: 0xc020e2b4
Oops: Kernel access of bad area, sig: 11 [#1]
TMCUTU
Modules linked in:
NIP: c020e2b4 LR: c020e274 CTR:
REGS: c7a41b40 TRAP: 0600 Not tainted (2.6.33)
MSR: 9032 EE,ME,IR,DR CR: 28002424 XER:
DAR: 09f52312, DSISR: 0120
TASK = c7889940[420] 'syslogd' THREAD: c7a4
GPR00: 09f52312 c7a41bf0 c7889940 0002 c7a41c40 c02734ac c78acc68
GPR08: c7a41c00 c78acc00 09f5214b c796e3d4 1001f444 bfe78700
GPR16: bfe77400 bfe77ee0 bfe773f8 0021 0ffef130 c7a41df0
GPR24: c7a41cf0 c7a41c70 0011 7f01 09f5214a c034a5cc c7a41c00
NIP [c020e2b4] ip_dev_find+0x90/0xf0
LR [c020e274] ip_dev_find+0x50/0xf0
Call Trace:
[c7a41bf0] [c020e274] ip_dev_find+0x50/0xf0 (unreliable)
[c7a41c60] [c01dd86c] __ip_route_output_key+0x8d4/0xb00
[c7a41d50] [c01ddab8] ip_route_output_flow+0x1c/0xa0
[c7a41d60] [c01ff8a0] ip4_datagram_connect+0x17c/0x2b8
[c7a41e30] [c020a75c] inet_dgram_connect+0x5c/0xa8
[c7a41e50] [c01a5030] sys_connect+0x7c/0xcc
[c7a41f00] [c01a6008] sys_socketcall+0x128/0x214
[c7a41f40] [c0011800] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff6e004
    LR = 0xfe2dac0
Instruction dump:
bb810060 38210070 7c0803a6 4e800020 88010052 2f82 409e0028 81210054
83a90068 2f9d 419e0018 381d01c8 7d200028 31290001 7d20012d 40a2fff4
---[ end trace 0824e85bac28e7e4 ]---

gdb says:

(gdb) list *0xc020e2b4
0xc020e2b4 is in ip_dev_find (/usr/local/src/BUILD/trunk/os2kernel/arch/powerpc/include/asm/atomic.h:106).
101
102     static __inline__ void atomic_inc(atomic_t *v)
103     {
104             int t;
105
106             __asm__ __volatile__(
107     1:      lwarx   %0,0,%2         # atomic_inc\n\
108             addic   %0,%0,1\n
109             PPC405_ERR77(0,%2)
110             stwcx.  %0,0,%2 \n\

(gdb) disass 0xc020e2b4 0xc020e2c4
Dump of assembler code from 0xc020e2b4 to 0xc020e2c4:
0xc020e2b4 <ip_dev_find+144>:   lwarx   r9,0,r0
0xc020e2b8 <ip_dev_find+148>:   addic   r9,r9,1
0xc020e2bc <ip_dev_find+152>:   stwcx.  r9,0,r0
0xc020e2c0 <ip_dev_find+156>:   bne-    0xc020e2b4 <ip_dev_find+144>

This is on an MPC8321 CPU, gcc 3.4.6.

Any ideas?

Jocke
Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.
Hello Joakim,

Joakim Tjernlund wrote:
> Could you try reverting patch:
>   8xx: Don't touch ACCESSED when no SWAP.
> and see if that makes a difference?
[...]
> Turning on pinned TLBs (you must turn on ADVANCED_OPTIONS first)
> could be an improvement, regardless of my patches.

here the results:

run    version
1-4    2.6.33-rc6 without your patches
5-8    2.6.33-rc6 with all your patches
9-12   2.6.33-rc6 with patches 1,2 and 4
       (without 8xx: Don't touch ACCESSED when no SWAP)
13-16  2.6.33-rc6 with all your patches and CONFIG_PIN_TLB=y

make[1]: Entering directory `/home/hs/lmbench-3.0-a9/results'

L M B E N C H  3 . 0   S U M M A R Y
(Alpha software, do not distribute)

Basic system parameters

Host    OS             Description        Mhz  tlb    cache  mem     scal
                                               pages  line   par     load
                                                      bytes
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0100     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     32     16  1.0100     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     28     16  1.1700     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0100     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66     28     16  1.0400     1
tqm8xx  Linux 2.6.33-  powerpc-linux-gnu   66      7     16  1.0400     1

Processor, Processes - times in microseconds - smaller is better

Host    OS             Mhz  null  null  stat  open  slct  sig   sig   fork  exec  sh
                            call  I/O         clos  TCP   inst  hndl  proc  proc  proc
tqm8xx  Linux 2.6.33-   66  2.97  10.3  129.  1377  272.  21.8  91.3  6949  29.K  89.K
tqm8xx  Linux 2.6.33-   66  3.06  10.5  124.  1375  273.  21.8  91.3  7136  30.K  89.K
tqm8xx  Linux 2.6.33-   66  3.06  10.6  129.  1365  272.  21.2  96.6  6889  29.K  89.K
tqm8xx  Linux 2.6.33-   66  3.06  10.5  124.  1309  272.  21.8  101.  6896  29.K  89.K
tqm8xx  Linux 2.6.33-   66  2.97  8.86  126.  1336  273.  21.7  84.2  6785  29.K  88.K
tqm8xx  Linux 2.6.33-   66  3.06  8.90  130.  1343  263.  21.3  84.7  7080  29.K  88.K
tqm8xx  Linux 2.6.33-   66  3.52  8.97  129.  1339  270.  22.4  84.4  6823  29.K  88.K
tqm8xx  Linux 2.6.33-   66  2.97  8.99  127.  1333  261.  22.4  87.0  7037  29.K  87.K
tqm8xx  Linux 2.6.33-   66  3.06  8.83  128.  1355  269.  20.7  89.2  6927  29.K  87.K
tqm8xx  Linux 2.6.33-   66  3.05  8.84  127.  1344  271.  21.6  90.5  6868  29.K  88.K
tqm8xx  Linux 2.6.33-   66  3.06  8.84  131.  1376  260.  21.4  88.1  7119  29.K  87.K
tqm8xx  Linux 2.6.33-   66  3.05  8.90  122.  1342  272.  21.4  88.6  6847  29.K  88.K
tqm8xx  Linux 2.6.33-   66  3.19  9.10  122.  1205  265.  20.9  90.3  6358  27.K  83.K
tqm8xx  Linux 2.6.33-   66  3.28  9.10  124.  1208  270.  20.9  95.2  6217  27.K  82.K
tqm8xx  Linux 2.6.33-   66  3.19  8.98  125.  1210  270.  21.1  87.9  6364  27.K  83.K
tqm8xx  Linux 2.6.33-   66  3.19  8.86  124.  1237  262.  21.3  90.7  6311  27.K  84.K

Basic integer operations - times in nanoseconds - smaller is better

Host    OS             intgr  intgr  intgr   intgr  intgr
                       bit    add    mul     div    mod
tqm8xx  Linux 2.6.33-   15.7   18.0  1.5600  124.2  203.1
tqm8xx  Linux 2.6.33-   15.7   17.4  1.5800  121.1  202.8
tqm8xx  Linux 2.6.33-   15.2   17.9  1.6200  124.2  202.7
tqm8xx  Linux 2.6.33-   15.2   17.9  1.6000  125.0  204.0
tqm8xx  Linux 2.6.33-   15.7   18.1  1.5600  124.7  204.4
tqm8xx  Linux 2.6.33-   15.7   18.1  1.5800  124.2  202.8
tqm8xx  Linux 2.6.33-   15.7   17.9  1.5500  124.2  203.2
tqm8xx  Linux 2.6.33-   15.7   18.1  1.5500  124.5  202.0
tqm8xx  Linux 2.6.33-   15.7   18.1  1.5500
Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.
Dear Heiko,

thanks for running the tests.

In message 4b8f8bb4.6070...@denx.de you wrote:
>
> here the results:
>
> run    version
> 1-4    2.6.33-rc6 without your patches
> 5-8    2.6.33-rc6 with all your patches
> 9-12   2.6.33-rc6 with patches 1,2 and 4
>        (without 8xx: Don't touch ACCESSED when no SWAP)
> 13-16  2.6.33-rc6 with all your patches and CONFIG_PIN_TLB=y

So CONFIG_PIN_TLB improves the performance as expected, while the other patches don't show any measurable improvement - or am I reading the results incorrectly?

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH, MD: Wolfgang Denk Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de
And now remains That we find out the cause of this effect, Or rather say, the cause of this defect... -- Hamlet, Act II, Scene 2
Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.
Wolfgang Denk w...@denx.de wrote on 2010/03/04 13:16:56:

From: Wolfgang Denk w...@denx.de
To: h...@denx.de
Cc: Joakim Tjernlund joakim.tjernl...@transmode.se, Klaus-Jürgen heyd...@kieback-peter.de, linuxppc-...@ozlabs.org, Scott Wood scottw...@freescale.com
Date: 2010/03/04 13:17
Subject: Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.

> Dear Heiko,
>
> thanks for running the tests.
>
> In message 4b8f8bb4.6070...@denx.de you wrote:
>>
>> here the results:
>>
>> run    version
>> 1-4    2.6.33-rc6 without your patches
>> 5-8    2.6.33-rc6 with all your patches
>> 9-12   2.6.33-rc6 with patches 1,2 and 4
>>        (without 8xx: Don't touch ACCESSED when no SWAP)
>> 13-16  2.6.33-rc6 with all your patches and CONFIG_PIN_TLB=y
>
> So CONFIG_PIN_TLB improves the performance as expected, while the
> other patches don't show any measurable improvement - or am I reading
> the results incorrectly?

Close but not quite. What stands out most is:

Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)

Host    OS             Mhz  L1 $   L2 $   Main mem  Rand mem  Guesses
tqm8xx  Linux 2.6.33-   66  31.8  141.0     184.0    1165.7
tqm8xx  Linux 2.6.33-   66  31.8  141.2     184.2    1165.3
tqm8xx  Linux 2.6.33-   66  31.8  141.3     184.3    1165.6
tqm8xx  Linux 2.6.33-   66  31.8  141.3     184.2    1166.2
tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1100.5  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1102.5  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1101.7  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1101.6  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  141.1     173.4    1149.1  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  141.1     173.4    1149.0  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.7  141.1     173.4    1148.7  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.7  141.1     173.4    1148.2  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  171.1     171.7    1099.8  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  171.1     171.6    1100.5  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.7  171.0     171.7    1101.0  No L2 cache?
tqm8xx  Linux 2.6.33-   66  31.8  171.0     171.6    1101.3  No L2 cache?

Besides the numbers, note how the first group doesn't have a Guesses entry. Is there something odd with the results for the first group?

Also, since you are using MODULES, patch 2 is nullified. Patch 1 is very minor and should not show, I think. This leaves patches 3 & 4. There appears to be something funny with patch 3, "Don't touch ACCESSED when no SWAP", as it yields bad numbers for Prot Fault, so perhaps I am missing something that needs ACCESSED even if NO_SWAP. Perhaps someone who knows MM in Linux knows?

Are there any messages in the kernel log (dmesg)?

Jocke
Re: [Patch v.2] mpc5200b/uart: improve baud rate calculation (reach high baud rates, better accuracy)
On Thu, Mar 4, 2010 at 2:56 AM, Albrecht Dreß albrecht.dr...@arcor.de wrote:
>> That way each set_divisor() can do whatever makes the most sense for the
>> divisors available to it. The 5121 for example has both a /10 and a /32
>> divisor, plus it can use an external clock.
>
> Ouch. I don't have a 512x, but isn't the current code plain wrong then?
> It uses mpc5xxx_get_bus_frequency() as input for the baud rate
> calculation, and if the serial code assumes /16 instead of /10, the
> result must be terribly off. Or did I miss something here?

If you are, then I'm missing the same thing. Do your best to keep the 5121 calculation working out to the same value it uses now. We'll ask someone with a 5121 to test it out before I add the patch to my -next branch.

g.
Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.
Hello Joakim,

Joakim Tjernlund wrote:
> Wolfgang Denk w...@denx.de wrote on 2010/03/04 13:16:56:
>
> From: Wolfgang Denk w...@denx.de
> To: h...@denx.de
> Cc: Joakim Tjernlund joakim.tjernl...@transmode.se, Klaus-Jürgen heyd...@kieback-peter.de, linuxppc-...@ozlabs.org, Scott Wood scottw...@freescale.com
> Date: 2010/03/04 13:17
> Subject: Re: [PATCH 0/4] 8xx: Optimize TLB Miss code.
>
>> Dear Heiko,
>>
>> thanks for running the tests.
>>
>> In message 4b8f8bb4.6070...@denx.de you wrote:
>>>
>>> here the results:
>>>
>>> run    version
>>> 1-4    2.6.33-rc6 without your patches
>>> 5-8    2.6.33-rc6 with all your patches
>>> 9-12   2.6.33-rc6 with patches 1,2 and 4
>>>        (without 8xx: Don't touch ACCESSED when no SWAP)
>>> 13-16  2.6.33-rc6 with all your patches and CONFIG_PIN_TLB=y
>>
>> So CONFIG_PIN_TLB improves the performance as expected, while the
>> other patches don't show any measurable improvement - or am I reading
>> the results incorrectly?
>
> Close but not quite. What stands out most is:
>
> Memory latencies in nanoseconds - smaller is better
> (WARNING - may not be correct, check graphs)
>
> Host    OS             Mhz  L1 $   L2 $   Main mem  Rand mem  Guesses
> tqm8xx  Linux 2.6.33-   66  31.8  141.0     184.0    1165.7
> tqm8xx  Linux 2.6.33-   66  31.8  141.2     184.2    1165.3
> tqm8xx  Linux 2.6.33-   66  31.8  141.3     184.3    1165.6
> tqm8xx  Linux 2.6.33-   66  31.8  141.3     184.2    1166.2
> tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1100.5  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1102.5  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1101.7  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  141.0     171.8    1101.6  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  141.1     173.4    1149.1  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  141.1     173.4    1149.0  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.7  141.1     173.4    1148.7  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.7  141.1     173.4    1148.2  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  171.1     171.7    1099.8  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  171.1     171.6    1100.5  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.7  171.0     171.7    1101.0  No L2 cache?
> tqm8xx  Linux 2.6.33-   66  31.8  171.0     171.6    1101.3  No L2 cache?
>
> Besides the numbers, note how the first group doesn't have a Guesses
> entry. Is there something odd with the results for the first group?

Hmm.. just to be safe, I ran this test again, but it also shows no entry in Guesses ... Hardware, Linux source, rootFS, lmbench sources, all the same ...

> Also, since you are using MODULES, patch 2 is nullified. Patch 1 is
> very minor and should not show, I think. This leaves patches 3 & 4.
> There appears to be something funny with patch 3, "Don't touch
> ACCESSED when no SWAP", as it yields bad numbers for Prot Fault, so
> perhaps I am missing something that needs ACCESSED even if NO_SWAP.
> Perhaps someone who knows MM in Linux knows?
>
> Are there any messages in the kernel log (dmesg)?

I couldn't find anything in the dmesg output ... but if you want this output, I can send it to you.

bye
Heiko

--
DENX Software Engineering GmbH, MD: Wolfgang Denk Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Re: [PATCH] gianfar: Fix TX ring processing on SMP machines
On Mar 4, 2010, at 2:41 AM, David Miller wrote:

> From: Anton Vorontsov avoront...@ru.mvista.com
> Date: Wed, 3 Mar 2010 21:18:58 +0300
>
>> Starting with commit a3bc1f11e9b867a4f49505 (gianfar: Revive SKB
>> recycling) gianfar driver sooner or later stops transmitting any
>> packets on SMP machines.
>>
>> start_xmit() prepares new skb for transmitting, generally it does
>> three things:
>>
>> 1. sets up all BDs (marks them ready to send), except the first one.
>> 2. stores skb into tx_queue->tx_skbuff so that clean_tx_ring()
>>    would clean it up later.
>> 3. sets up the first BD, i.e. marks it ready.
>>
>> Here is what clean_tx_ring() does:
>>
>> 1. reads skbs from tx_queue->tx_skbuff
>> 2. checks if the *last* BD is ready. If it's still ready [to send]
>>    then it isn't transmitted, so clean_tx_ring() returns.
>>    Otherwise it actually cleans up BDs. All is OK.
>>
>> Now, if there is just one BD, code flow:
>>
>> - start_xmit(): stores skb into tx_skbuff. Note that the first BD
>>   (which is also the last one) isn't marked as ready, yet.
>> - clean_tx_ring(): sees that skb is not null, *and* its lstatus
>>   says that it is NOT ready (like if BD was sent), so it cleans
>>   it up (bad!)
>> - start_xmit(): marks BD as ready [to send], but it's too late.
>>
>> We can fix this simply by reordering lstatus/tx_skbuff writes.
>>
>> Reported-by: Martyn Welch martyn.we...@ge.com
>> Bisected-by: Paul Gortmaker paul.gortma...@windriver.com
>> Signed-off-by: Anton Vorontsov avoront...@ru.mvista.com
>> Tested-by: Paul Gortmaker paul.gortma...@windriver.com
>> Tested-by: Martyn Welch martyn.we...@ge.com
>
> Applied.

Anton,

Once this makes it into Linus's tree, can you make sure we get it added to -stable?

- k
Re: [PATCH] powerpc: Renaming following split of GE Fanuc joint venture
On Mar 1, 2010, at 8:41 AM, Martyn Welch wrote:

> This patch renames GE Fanuc boards following the split-up of the GE
> Fanuc joint venture. These boards are now made by GE Intelligent
> Platforms.
>
> Signed-off-by: Martyn Welch martyn.we...@gefanuc.com
> ---
>  arch/powerpc/boot/dts/gef_ppc9a.dts      |  4 ++--
>  arch/powerpc/boot/dts/gef_sbc310.dts     |  4 ++--
>  arch/powerpc/boot/dts/gef_sbc610.dts     |  4 ++--
>  arch/powerpc/platforms/86xx/Kconfig      | 12 ++--
>  arch/powerpc/platforms/86xx/gef_gpio.c   | 10 +-
>  arch/powerpc/platforms/86xx/gef_pic.c    |  6 +++---
>  arch/powerpc/platforms/86xx/gef_ppc9a.c  | 12 ++--
>  arch/powerpc/platforms/86xx/gef_sbc310.c | 12 ++--
>  arch/powerpc/platforms/86xx/gef_sbc610.c | 12 ++--
>  9 files changed, 38 insertions(+), 38 deletions(-)

applied to next

- k
Re: [PATCH 44/66] arch/powerpc/sysdev/cpm2_pic.h: Checkpatch cleanup
On Feb 27, 2010, at 10:51 AM, Andrea Gelmini wrote:

> arch/powerpc/sysdev/cpm2_pic.h:6: ERROR: (foo*) should be (foo *)
>
> Signed-off-by: Andrea Gelmini andrea.gelm...@gelma.net
> ---
>  arch/powerpc/sysdev/cpm2_pic.h | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

applied to next

- k
Re: [PATCH 1/2] perf_event: Build callchain code regardless of hardware event support.
On Feb 25, 2010, at 6:09 PM, Paul Mackerras wrote:

> On Thu, Feb 25, 2010 at 06:04:33PM -0600, Scott Wood wrote:
>
>> It's also useful for software events, as well as future support
>> for other types of hardware counters.
>>
>> Signed-off-by: Scott Wood scottw...@freescale.com
>
> Acked-by: Paul Mackerras pau...@samba.org

applied to next

- k
Re: [PATCH 2/2] perf_event: e500 support
On Feb 25, 2010, at 6:09 PM, Scott Wood wrote:

> This implements perf_event support for the Freescale embedded
> performance monitor, based on the existing perf_event.c that supports
> server/classic chips.
>
> Some limitations:
> - Performance monitor interrupts are regular EE interrupts, and thus
>   you can't profile places with interrupts disabled. We may want to
>   implement soft IRQ-disabling, with perfmon interrupts exempted and
>   treated as NMIs.
>
> - When trying to schedule multiple event groups at once, and using
>   restricted events, situations could arise where scheduling fails
>   even though it would be possible. Consider three groups, each with
>   two events. One group has restricted events, the others don't. The
>   two non-restricted groups are scheduled, then one is removed, which
>   happens to occupy the two counters that can't do restricted events.
>   The remaining non-restricted group will not be moved to the
>   non-restricted-capable counters to make room if the restricted group
>   tries to be scheduled.
>
> Signed-off-by: Scott Wood scottw...@freescale.com
> ---
> Changes from previous version:
> - Factored out callchain makefile patch
> - Split up header files
> - Renamed pmu struct
> - Added threshold support
>
>  arch/powerpc/include/asm/perf_event.h              | 133 +
>  arch/powerpc/include/asm/perf_event_fsl_emb.h      |  50 ++
>  .../asm/{perf_event.h => perf_event_server.h}      |   4 +-
>  arch/powerpc/include/asm/reg_fsl_emb.h             |   2 +-
>  arch/powerpc/kernel/Makefile                       |   4 +
>  arch/powerpc/kernel/cputable.c                     |   2 +-
>  arch/powerpc/kernel/e500-pmu.c                     | 129
>  arch/powerpc/kernel/perf_event_fsl_emb.c           | 654
>  arch/powerpc/platforms/Kconfig.cputype             |  10 +
>  9 files changed, 874 insertions(+), 114 deletions(-)
>  rewrite arch/powerpc/include/asm/perf_event.h (92%)
>  create mode 100644 arch/powerpc/include/asm/perf_event_fsl_emb.h
>  rename arch/powerpc/include/asm/{perf_event.h => perf_event_server.h} (98%)
>  create mode 100644 arch/powerpc/kernel/e500-pmu.c
>  create mode 100644 arch/powerpc/kernel/perf_event_fsl_emb.c

Paul, do you intend to Ack this or don't care?

- k
[Patch v.3] mpc5200b/uart: improve baud rate calculation (reach high baud rates, better accuracy)
On the MPC5200B, make very high baud rates (e.g. 3 MBaud) accessible and achieve a higher precision for high baud rates in general. This is done by selecting the appropriate prescaler (/4 or /32). To keep the code clean, the getuartclk method has been dropped, and all calculations are done in a new set_baudrate method.

Notes: only fsl,mpc5200b-psc-uart compatible devices benefit from these improvements. The 512x may or may not work; the patch keeps the current implementation (using a /16 prescaler), but according to the data sheet, this is plain wrong. See the comment in mpc512x_psc_set_baudrate(). Any insight and testing of the code would be appreciated.

Tested on a custom 5200B based board, from 110 baud up to 3 MBaud, and with both fsl,mpc5200b-psc-uart and fsl,mpc5200-psc-uart devices.

Signed-off-by: Albrecht Dreß albrecht.dr...@arcor.de
---
Changes vs. v.2: Pick up Grant's comments by shifting the calculations to the new set_baudrate method.

--- linux-2.6.33-orig/drivers/serial/mpc52xx_uart.c	2010-02-24 19:52:17.0 +0100
+++ linux-2.6.33/drivers/serial/mpc52xx_uart.c	2010-03-04 17:13:47.0 +0100
@@ -144,9 +144,21 @@ struct psc_ops {
 	unsigned char	(*read_char)(struct uart_port *port);
 	void		(*cw_disable_ints)(struct uart_port *port);
 	void		(*cw_restore_ints)(struct uart_port *port);
-	unsigned long	(*getuartclk)(void *p);
+	unsigned int	(*set_baudrate)(struct uart_port *port,
+					struct ktermios *new,
+					struct ktermios *old);
 };
 
+/* setting the prescaler and divisor reg is common for all chips */
+static inline void mpc52xx_set_divisor(struct mpc52xx_psc __iomem *psc,
+				       u16 prescaler, unsigned int divisor)
+{
+	/* select prescaler */
+	out_be16(&psc->mpc52xx_psc_clock_select, prescaler);
+	out_8(&psc->ctur, divisor >> 8);
+	out_8(&psc->ctlr, divisor & 0xff);
+}
+
 #ifdef CONFIG_PPC_MPC52xx
 #define FIFO_52xx(port) ((struct mpc52xx_psc_fifo __iomem *)(PSC(port)+1))
 static void mpc52xx_psc_fifo_init(struct uart_port *port)
@@ -154,9 +166,6 @@ static void mpc52xx_psc_fifo_init(struct
 	struct mpc52xx_psc __iomem *psc = PSC(port);
 	struct mpc52xx_psc_fifo __iomem *fifo = FIFO_52xx(port);
 
-	/* /32 prescaler */
-	out_be16(&psc->mpc52xx_psc_clock_select, 0xdd00);
-
 	out_8(&fifo->rfcntl, 0x00);
 	out_be16(&fifo->rfalarm, 0x1ff);
 	out_8(&fifo->tfcntl, 0x07);
@@ -245,15 +254,47 @@ static void mpc52xx_psc_cw_restore_ints(
 	out_be16(&PSC(port)->mpc52xx_psc_imr, port->read_status_mask);
 }
 
-/* Search for bus-frequency property in this node or a parent */
-static unsigned long mpc52xx_getuartclk(void *p)
-{
-	/*
-	 * 5200 UARTs have a / 32 prescaler
-	 * but the generic serial code assumes 16
-	 * so return ipb freq / 2
-	 */
-	return mpc5xxx_get_bus_frequency(p) / 2;
+static unsigned int mpc5200_psc_set_baudrate(struct uart_port *port,
+					     struct ktermios *new,
+					     struct ktermios *old)
+{
+	unsigned int baud;
+	unsigned int divisor;
+
+	/* The 5200 has a fixed /32 prescaler, uartclk contains the ipb freq */
+	baud = uart_get_baud_rate(port, new, old,
+				  port->uartclk / (32 * 0x) + 1,
+				  port->uartclk / 32);
+	divisor = (port->uartclk + 16 * baud) / (32 * baud);
+
+	/* enable the /32 prescaler and set the divisor */
+	mpc52xx_set_divisor(PSC(port), 0xdd00, divisor);
+	return baud;
+}
+
+static unsigned int mpc5200b_psc_set_baudrate(struct uart_port *port,
+					      struct ktermios *new,
+					      struct ktermios *old)
+{
+	unsigned int baud;
+	unsigned int divisor;
+	u16 prescaler;
+
+	/* The 5200B has a selectable /4 or /32 prescaler, uartclk contains the
+	 * ipb freq */
+	baud = uart_get_baud_rate(port, new, old,
+				  port->uartclk / (32 * 0x) + 1,
+				  port->uartclk / 4);
+	divisor = (port->uartclk + 2 * baud) / (4 * baud);
+
+	/* select the proper prescaler and set the divisor */
+	if (divisor > 0x) {
+		divisor = (divisor + 4) / 8;
+		prescaler = 0xdd00; /* /32 */
+	} else
+		prescaler = 0xff00; /* /4 */
+	mpc52xx_set_divisor(PSC(port), prescaler, divisor);
+	return baud;
 }
 
 static struct psc_ops mpc52xx_psc_ops = {
@@ -272,7 +313,26 @@ static struct psc_ops mpc52xx_psc_ops =
 	.read_char = mpc52xx_psc_read_char,
 	.cw_disable_ints = mpc52xx_psc_cw_disable_ints,
 	.cw_restore_ints = mpc52xx_psc_cw_restore_ints,
-	.getuartclk = mpc52xx_getuartclk,
+
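The prescaler selection in mpc5200b_psc_set_baudrate() can be tried outside the kernel with a small standalone sketch. The names below (psc_baud_cfg, psc_pick_divisor, PSC_COUNTER_MAX) are hypothetical and not part of the patch. The 16-bit counter limit is an assumption, inferred from the divisor being written as two 8-bit halves (ctur and ctlr), because the constant itself is cut off in the archived text. The rounding arithmetic follows the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PSC_PRESCALER_DIV32	0xdd00u	/* clock select value used for /32 in the patch */
#define PSC_PRESCALER_DIV4	0xff00u	/* clock select value used for /4 in the patch */
#define PSC_COUNTER_MAX		0xffffu	/* assumption: counter is 16 bits (ctur:ctlr) */

/* Hypothetical result type: which prescaler to select and what to load. */
struct psc_baud_cfg {
	uint16_t prescaler;
	uint32_t divisor;
};

/*
 * Start from the /4 prescaler with round-to-nearest division, and fall
 * back to /32 (dividing the divisor by 8, again rounded) when the value
 * no longer fits into the counter.
 */
static struct psc_baud_cfg psc_pick_divisor(uint32_t ipb_hz, uint32_t baud)
{
	struct psc_baud_cfg cfg;

	assert(baud > 0);	/* a zero baud rate would divide by zero */

	cfg.divisor = (ipb_hz + 2 * baud) / (4 * baud);
	if (cfg.divisor > PSC_COUNTER_MAX) {
		cfg.divisor = (cfg.divisor + 4) / 8;
		cfg.prescaler = PSC_PRESCALER_DIV32;
	} else {
		cfg.prescaler = PSC_PRESCALER_DIV4;
	}
	return cfg;
}
```

Assuming an IPB clock of 132 MHz (an example value, not taken from the patch), 3 MBaud gives a divisor of 11 with the /4 prescaler, while 110 baud overflows the /4 range and ends up with the /32 prescaler and a divisor of 37500.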
Re: [RFC: PATCH 08/13] powerpc/476: define specific cpu table entry for DD1 and DD1.1 cores
On Mon, Mar 1, 2010 at 11:13 AM, Dave Kleikamp sha...@linux.vnet.ibm.com wrote:

> powerpc/476: define specific cpu table entry for DD1 and DD1.1 cores
>
> From: Benjamin Herrenschmidt b...@kernel.crashing.org
>
> There are still some unstable bits on the DD1 and DD1.1 cores. Don't
> use the FPU or the tlbivax operation. Define CPU_FTR_476_DD1 and
> CPU_FTR_476_DD1_1 for additional workarounds in later patches.
>
> The DD1 core requires workarounds triggered by both CPU_FTR_476_DD1
> and CPU_FTR_476_DD1_1. The DD1.1 core only needs CPU_FTR_476_DD1_1
> defined.

Isn't the policy generally not to commit workarounds for early/errataful hardware which will not be seen in the real world? Otherwise, every new half-broken core could burn a bunch of feature bits...

-Hollis
Re: [PATCHv4 2/2] powerpc: implement arch_scale_smt_power for Power7
In message 1267541076.25158.60.ca...@laptop you wrote:
> On Sat, 2010-02-27 at 21:21 +1100, Michael Neuling wrote:
> In message 11927.1267010...@neuling.org you wrote:
>
> If there's less the group will normally be balanced and we fall out
> and end up in check_asym_packing().
>
> So what I tried doing with that loop is detect if there's a hole in
> the packing before busiest. Now that I think about it, what we need to
> check is if this_cpu (the removed cpu argument) is idle and less than
> busiest.
>
> So something like:
>
> static int check_asym_packing(struct sched_domain *sd,
>                               struct sd_lb_stats *sds,
>                               int this_cpu, unsigned long *imbalance)
> {
>         int busiest_cpu;
>
>         if (!(sd->flags & SD_ASYM_PACKING))
>                 return 0;
>
>         if (!sds->busiest)
>                 return 0;
>
>         busiest_cpu = group_first_cpu(sds->busiest);
>         if (cpu_rq(this_cpu)->nr_running || this_cpu > busiest_cpu)
>                 return 0;
>
>         *imbalance = (sds->max_load * sds->busiest->cpu_power) /
>                         SCHED_LOAD_SCALE;
>         return 1;
> }
>
> Does that make sense?
>
> I think so.
>
> I'm seeing check_asym_packing do the right thing with the simple SMT2
> with 1 process case. It marks cpu0 as imbalanced when cpu0 is idle and
> cpu1 is busy.
>
> Unfortunately the process doesn't seem to get migrated down though.
> Do we need to give *imbalance a higher value?
>
> So with ego's help, I traced this down a bit more.
>
> In my simple test case (SMT2, t0 idle, t1 active) if f_b_g() hits our
> new case in check_asym_packing(), load_balance then runs f_b_q().
> f_b_q() has this:
>
>         if (capacity && rq->nr_running == 1 && wl > imbalance)
>                 continue;
>
> when check_asym_packing() hits, wl = 1783 and imbalance = 1024, so we
> continue and busiest remains NULL.
>
> load_balance then does goto out_balanced and it doesn't attempt to
> move the task.
>
> Based on this and on ego's suggestion I pulled in Suresh Siddha's patch
> from: http://lkml.org/lkml/2010/2/12/352. This fixes the problem. The
> process is moved down to t0.
>
> I've only tested SMT2 so far.
>
> I'm finding this SMT2 result to be unreliable. Sometimes it doesn't
> work for the simple 1 process case. It seems to change boot to boot.
> Sometimes it works as expected with t0 busy and t1 idle, but other
> times it's the other way around.
>
> When it doesn't work, check_asym_packing() is still marking processes
> to be pulled down but only gets run about 1 in every 4 calls to
> load_balance().
>
> For 2 of the other calls to load_balance, idle is CPU_NEWLY_IDLE and
> hence check_asym_packing() doesn't get called. This results in
> sd->nr_balance_failed being reset. When load_balance is next called
> and check_asym_packing() hits, need_active_balance() returns 0 as
> sd->nr_balance_failed is too small. This means the migration thread on
> t1 is not woken and the process remains there.
>
> So why does thread0 change from NEWLY_IDLE to IDLE and vice versa,
> when there is nothing running on it? Is this expected?
>
> Ah, yes, you should probably allow both those.
>
> NEWLY_IDLE is when we are about to schedule the idle thread, IDLE is
> when a tick hits the idle thread.
>
> I'm thinking that NEWLY_IDLE should also solve the NO_HZ case, since
> we'll have passed through that before we enter tickless state, just
> make sure SD_BALANCE_NEWIDLE is set on the relevant levels (should
> already be so).

OK, thanks.

There seems to be a regression in Linus' latest tree (also -next) where new processes usually end up on thread 1 rather than 0 (when in SMT2 mode). This only seems to happen with newly created processes.

If you pin a process to t0 and then unpin it, it stays on t0. Also if a process is migrated to another core, it can end up on t0.

This happens with a vanilla linus or -next tree on ppc64 pseries_defconfig - NO_HZ. I've not tried with NO_HZ.

Anyway, this regression seems to be causing problems when we apply our patch. We are trying to pull down to t0, which works, but we immediately get pulled back up to t1 due to the above regression. This happens over and over, causing the process to ping-pong every few sched ticks.

We've not tried to bisect this problem but that's the next step unless someone has some insights into the problem.

Also, we had to change the following to get the pull down to work correctly in the original patch:

@@ -2618,8 +2618,8 @@ static int check_asym_packing(struct sch
 	if (this_cpu > busiest_cpu)
 		return 0;
 
-	*imbalance = (sds->max_load * sds->busiest->cpu_power) /
-			SCHED_LOAD_SCALE;
+	*imbalance = DIV_ROUND_CLOSEST(sds->max_load * sds->busiest->cpu_power,
+
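The interaction between the computed imbalance and the skip test in f_b_q() can be checked with a standalone sketch. This is not scheduler code: asym_imbalance() and fbq_skips_rq() are hypothetical helpers that copy only the two expressions quoted in the message, and SCHED_LOAD_SCALE is assumed to be 1024. The values wl = 1783 and imbalance = 1024 come from the message; the max_load and cpu_power inputs used in the note below are made-up values chosen to reproduce that imbalance.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SCHED_LOAD_SCALE 1024UL	/* assumed value of the scale constant */

/* The imbalance expression from the proposed check_asym_packing(). */
static unsigned long asym_imbalance(unsigned long max_load,
				    unsigned long cpu_power)
{
	/* keep the multiplication from overflowing */
	assert(cpu_power == 0 || max_load <= ULONG_MAX / cpu_power);

	return (max_load * cpu_power) / SCHED_LOAD_SCALE;
}

/* The skip condition quoted from f_b_q(): true means the runqueue is passed over. */
static bool fbq_skips_rq(unsigned long capacity, unsigned int nr_running,
			 unsigned long wl, unsigned long imbalance)
{
	return capacity && nr_running == 1 && wl > imbalance;
}
```

With the reported numbers, fbq_skips_rq(1, 1, 1783, 1024) is true, so the only candidate runqueue is skipped and busiest stays NULL; the queue would be considered only once imbalance is at least as large as wl.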
Re: [PATCH 2/2] perf_event: e500 support
On Thu, Mar 04, 2010 at 10:48:03AM -0600, Kumar Gala wrote:

> Paul, do you intend to Ack this or don't care?

Sorry, thought I had.

Acked-by: Paul Mackerras pau...@samba.org
Re: [PATCH 7/7] powerpc/85xx: Fix the RapidIO maintenance access functions
Quoting Micha Nelissen mi...@neli.hopto.org:

> Bounine, Alexandre wrote:
>> Hi Micha,
>>
>> I tested it on my setup - it works.
>> Maybe Thomas may give more details on this change.
>
> Did you (for fun) try once to decrease the maintenance window to say,
> 4 kB? Then you really need these high bits to work properly.

We have never changed the configuration of the window size in Linux, but we have a test application that uses this 4 KB window setting. With the current window configuration in Linux, I expect problems when someone tries to read a register that is located at offset 512 KB. I currently have no equipment to verify this behaviour.

> Or did you try with some register at offset 4MB? The Tundra's have
> registers going up to 0x14000 or so? So don't need 16MB addressing for
> that.

We have devices that require access to registers that are located at offset 15 MB.

> Thanks,
>
> Micha
>
>> -Original Message-
>> From: Micha Nelissen [mailto:mi...@neli.hopto.org]
>> Sent: Wednesday, February 24, 2010 3:21 PM
>> To: Alexandre Bounine
>> Subject: Re: [PATCH 7/7] powerpc/85xx: Fix the RapidIO maintenance
>> access functions
>>
>> Alexandre Bounine wrote:
>>>  	out_be32(&priv->maint_atmu_regs->rowtar,
>>> -		 (destid << 22) | (hopcount << 12) | ((offset & ~0x3) >> 9));
>>> +		 (destid << 22) | (hopcount << 12) | (offset >> 12));
>>> +	out_be32(&priv->maint_atmu_regs->rowtear, (destid >> 10));
>>
>> Did this actually work for you? The (offset >> 12) is due to the 4MB
>> window size right?
>>
>> Micha