Re: [PATCH] RISC-V: Allow drivers to provide custom read_cycles64 for M-mode kernel
Hi Palmer, On Sat, Sep 5, 2020 at 9:14 AM Anup Patel wrote: > > On Sat, Sep 5, 2020 at 6:47 AM Palmer Dabbelt > wrote: > > > > On Fri, 04 Sep 2020 09:57:09 PDT (-0700), Christoph Hellwig wrote: > > > On Fri, Sep 04, 2020 at 10:13:18PM +0530, Anup Patel wrote: > > >> I respectfully disagree. IMHO, the previous code made the RISC-V > > >> timer driver convoluted (both SBI call and CLINT in one place) and > > >> mandated CLINT for NoMMU kernel. In fact, RISC-V spec does not > > >> mandate CLINT or PLIC. The RISC-V SOC vendors are free to > > >> implement their own timer device, IPI device and interrupt controller. > > > > > > Yes, exactly what we need is everyone coming up with another stupid > > > non-standard timer and irq driver. > > > > Well, we don't have a standard one so there's really no way around people > > coming up with their own. It doesn't seem reasonable to just say "SiFive's > > driver landed first, so we will accept no other timer drivers for RISC-V > > systems". > > I share the same views here. > > In ARM 32bit world (arch/arm/), we have the same problem with no standard > timer device, IPI device, and interrupt controller. The ARM GICv2/GICv3 and > ARM Generic Timers were standardized very late in the ARM world so by that > time we had lots of custom timers and interrupt controllers. All these ARM > timer and interrupt controller drivers are now part of drivers/clocksource and > drivers/irqchip. > > The ARM 32bit world has following indirections available to drivers: > 1. set_smp_cross_call() in asm/smp.h for IPI injection > (We have riscv_set_ipi_ops() in asm/smp.h) > 2. register_current_timer_delay() in asm/delay.h > (My patch is proposing riscv_set_read_cycles64() in asm/timex.h) > > For RISC-V S-mode (MMU) kernel, we are using SBI calls for IPIs and > "TIME CSR + SBI calls" (i.e. RISC-V timer) as timer device which simplifies > things for regular S-mode kernel. 
> > For RISC-V M-mode (NoMMU) kernel, we don't have any SBI provider > so we end-up having separate drivers for timer device, and IPI device > which is similar to ARM 32bit world. > > > > > > But the point is this crap came in after -rc1, and it adds totally > > > pointless indirect calls to the IPI path, and with your "fix" also > > > to get_cycles which all have exactly one implementation for MMU or > > > NOMMU kernels. > > > > > > So the only sensible thing is to revert all this crap. And if at some > > > point we actually have to deal with different implementations do it > > > with alternatives or static_branch infrastructure so that we don't > > > pay the price for indirect calls in the super hot path. > > > > I'm OK reverting the dynamic stuff, as I can buy it needs more time to bake, > > but I'm not sure performance is the right argument -- while this is adding > > an > > indirection, decoupling MMU/NOMMU from the timer driver is the first step > > towards getting rid of the traps which are a way bigger performance issue > > than > > the indirection (not to mention the issue of relying on instructions that > > don't > > technically exist in the ISA we're relying on any more). > > > > I'm not really convinced the timers are on such a hot path that an extra > > load > > is that bad, but I don't have that much experience with this stuff so you > > may > > be right. I'd prefer to keep the driver separate, though, and just bring > > back > > the direct CLINT implementation in timex.h -- we've only got one > > implementation > > for now anyway, so it doesn't seem that bad to just inline it (and I > > suppose I > > could buy that the ISA says this register has to behave this way, though I > > don't think that's all that strong of an argument). 
> > > > I'm not convinced this is a big performance hit for IPIs either, but we > > could > > just do the same thing over there -- though I guess I'd be much less > > convinced > > about any arguments as to the ISA having a say in that as IIRC it's a lot > > more > > hands off. > > > > Something like this seems to fix the rdtime issue without any extra > > overhead, > > but I haven't tested it > > I had initially thought about directly doing MMIO in asm/timex.h. > > Your patch is CLINT specific because it assumes a 64bit MMIO register which > is always counting upwards. This will break if we have downard counting timer > on some SOC. It will also break if some SOC has implementation specific CSR > for reading cycles. Your patch will also break if the SOC specific timer has a 32bit free-running counter unlike the 64bit free-running counter found on CLINT. I guess it's better to let the SOC timer driver provide the method/function to read the free-running counter. Regards, Anup > > I am fine with your patch if you can address the above mentioned issue. > > > > > diff --git a/arch/riscv/include/asm/clint.h b/arch/riscv/include/asm/clint.h > > new file mode 100644 > > index ..51909ab60ad0 > > --- /dev/null > > +++ b/arch/riscv/include/asm/clint.h > >
Re: v5.9-rc3-rt3 boot time networking lockdep splat
Lappy, which does not use bridge, boots clean... but lock leakage pretty darn quickly inspires lockdep to craps its drawers. [ 209.00] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low! [ 209.001113] turning off the locking correctness validator. [ 209.001114] CPU: 2 PID: 3773 Comm: Socket Thread Tainted: G SI E 5.9.0.gc70672d-rt3-rt #8 [ 209.001117] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017 [ 209.001118] Call Trace: [ 209.001123] dump_stack+0x77/0x9b [ 209.001129] validate_chain+0xf60/0x1230 [ 209.001135] __lock_acquire+0x880/0xbf0 [ 209.001139] lock_acquire+0x92/0x3f0 [ 209.001142] ? rcu_note_context_switch+0x118/0x550 [ 209.001146] ? update_load_avg+0x5cc/0x6d0 [ 209.001150] _raw_spin_lock+0x2f/0x40 [ 209.001153] ? rcu_note_context_switch+0x118/0x550 [ 209.001155] rcu_note_context_switch+0x118/0x550 [ 209.001157] ? lockdep_hardirqs_off+0x6e/0xd0 [ 209.001161] __schedule+0xbe/0xb50 [ 209.001163] ? mark_held_locks+0x2d/0x80 [ 209.001166] preempt_schedule_irq+0x44/0xb0 [ 209.001168] irqentry_exit+0x5b/0x80 [ 209.001170] asm_sysvec_reschedule_ipi+0x12/0x20 [ 209.001173] RIP: 0010:debug_lockdep_rcu_enabled+0x23/0x30 [ 209.001175] Code: 0f 0b e9 6d ff ff ff 8b 05 0a a0 c5 00 85 c0 74 21 8b 05 cc da c5 00 85 c0 74 17 65 48 8b 04 25 c0 91 01 00 8b 80 8c 0a 00 00 <85> c0 0f 94 c0 0f b6 c0 f3 c3 cc cc cc 65 48 8b 04 25 c0 91 01 00 [ 209.001178] RSP: 0018:a00202a0f998 EFLAGS: 0202 [ 209.001179] RAX: RBX: 90a8a6d1da20 RCX: 0001 [ 209.001180] RDX: 0002 RSI: 971308fc RDI: 9710b092 [ 209.001181] RBP: 0048 R08: 0001 R09: 0001 [ 209.001181] R10: 90a8a6d1da38 R11: 0006 R12: 97405280 [ 209.001182] R13: 0008 R14: 97405240 R15: 0100 [ 209.001188] rt_spin_unlock+0x2c/0x90 [ 209.001191] __do_softirq+0xc1/0x5b2 [ 209.001194] ? ip_finish_output2+0x264/0xa10 [ 209.001197] __local_bh_enable_ip+0x230/0x290 [ 209.001200] ip_finish_output2+0x288/0xa10 [ 209.001201] ? rcu_read_lock_held+0x32/0x40 [ 209.001206] ? 
ip_output+0x70/0x200 [ 209.001207] ip_output+0x70/0x200 [ 209.001210] ? __ip_finish_output+0x320/0x320 [ 209.001212] __ip_queue_xmit+0x1f0/0x5d0 [ 209.001216] __tcp_transmit_skb+0xa7f/0xc70 [ 209.001219] ? __alloc_skb+0x7b/0x1b0 [ 209.001222] ? __kmalloc_node_track_caller+0x252/0x330 [ 209.001230] tcp_rcv_established+0x365/0x6d0 [ 209.001233] tcp_v4_do_rcv+0x7e/0x1b0 [ 209.001236] __release_sock+0x89/0x130 [ 209.001239] release_sock+0x3c/0xd0 [ 209.001241] tcp_recvmsg+0x2b9/0xa90 [ 209.001247] inet_recvmsg+0x6b/0x210 [ 209.001252] __sys_recvfrom+0xb8/0x110 [ 209.001256] ? poll_select_finish+0x1f0/0x1f0 [ 209.001261] ? syscall_enter_from_user_mode+0x37/0x340 [ 209.001263] ? syscall_enter_from_user_mode+0x3c/0x340 [ 209.001265] ? lockdep_hardirqs_on+0x78/0x100 [ 209.001268] __x64_sys_recvfrom+0x24/0x30 [ 209.001269] do_syscall_64+0x33/0x40 [ 209.001271] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 209.001274] RIP: 0033:0x7ff2421a230a [ 209.001276] Code: 7c 24 08 4c 89 14 24 e8 44 f8 ff ff 45 31 c9 89 c3 45 31 c0 4c 8b 14 24 4c 89 e2 48 89 ee 48 8b 7c 24 08 b8 2d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 89 df 48 89 04 24 e8 73 f8 ff ff 48 8b 04 [ 209.001278] RSP: 002b:7ff24243a550 EFLAGS: 0246 ORIG_RAX: 002d [ 209.001279] RAX: ffda RBX: RCX: 7ff2421a230a [ 209.001280] RDX: 34da RSI: 7ff21094fb37 RDI: 006b [ 209.001281] RBP: 7ff21094fb37 R08: R09: [ 209.001282] R10: R11: 0246 R12: 34da [ 209.001283] R13: 7ff21094fb37 R14: R15: 7ff20e8a4000
v5.9-rc3-rt3 boot time networking lockdep splat
[ 22.004225] r8169 :03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off [ 22.004450] br0: port 1(eth0) entered blocking state [ 22.004473] br0: port 1(eth0) entered forwarding state [ 22.006411] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready [ 22.024936] == [ 22.024936] WARNING: possible circular locking dependency detected [ 22.024937] 5.9.0.gc70672d-rt3-rt #8 Tainted: GE [ 22.024938] -- [ 22.024939] ksoftirqd/0/10 is trying to acquire lock: [ 22.024941] 983475521278 (>q.lock){+...}-{0:0}, at: sch_direct_xmit+0x81/0x2f0 [ 22.024947] but task is already holding lock: [ 22.024947] 9834755212b8 (>seqcount#9){+...}-{0:0}, at: br_dev_queue_push_xmit+0x7d/0x180 [bridge] [ 22.024959] which lock already depends on the new lock. [ 22.024960] the existing dependency chain (in reverse order) is: [ 22.024961] -> #1 (>seqcount#9){+...}-{0:0}: [ 22.024963]lock_acquire+0x92/0x3f0 [ 22.024967]__dev_queue_xmit+0xce7/0xe30 [ 22.024969]br_dev_queue_push_xmit+0x7d/0x180 [bridge] [ 22.024974]br_forward_finish+0x10a/0x1b0 [bridge] [ 22.024980]__br_forward+0x17d/0x300 [bridge] [ 22.024984]br_dev_xmit+0x442/0x570 [bridge] [ 22.024990]dev_hard_start_xmit+0xc5/0x3f0 [ 22.024992]__dev_queue_xmit+0x9db/0xe30 [ 22.024993]ip6_finish_output2+0x26a/0x990 [ 22.024995]ip6_output+0x6d/0x260 [ 22.024996]mld_sendpack+0x1d9/0x360 [ 22.024999]mld_ifc_timer_expire+0x1f7/0x370 [ 22.025000]call_timer_fn+0xa0/0x390 [ 22.025003]run_timer_softirq+0x59a/0x720 [ 22.025004]__do_softirq+0xc1/0x5b2 [ 22.025006]run_ksoftirqd+0x47/0x70 [ 22.025007]smpboot_thread_fn+0x266/0x320 [ 22.025009]kthread+0x171/0x190 [ 22.025010]ret_from_fork+0x1f/0x30 [ 22.025013] -> #0 (>q.lock){+...}-{0:0}: [ 22.025015]validate_chain+0xa81/0x1230 [ 22.025016]__lock_acquire+0x880/0xbf0 [ 22.025017]lock_acquire+0x92/0x3f0 [ 22.025018]rt_spin_lock+0x78/0xd0 [ 22.025020]sch_direct_xmit+0x81/0x2f0 [ 22.025022]__dev_queue_xmit+0xd38/0xe30 [ 22.025023]br_dev_queue_push_xmit+0x7d/0x180 [bridge] [ 
22.025029]br_forward_finish+0x10a/0x1b0 [bridge] [ 22.025033]__br_forward+0x17d/0x300 [bridge] [ 22.025039]br_dev_xmit+0x442/0x570 [bridge] [ 22.025043]dev_hard_start_xmit+0xc5/0x3f0 [ 22.025044]__dev_queue_xmit+0x9db/0xe30 [ 22.025046]ip6_finish_output2+0x26a/0x990 [ 22.025047]ip6_output+0x6d/0x260 [ 22.025049]mld_sendpack+0x1d9/0x360 [ 22.025050]mld_ifc_timer_expire+0x1f7/0x370 [ 22.025052]call_timer_fn+0xa0/0x390 [ 22.025053]run_timer_softirq+0x59a/0x720 [ 22.025054]__do_softirq+0xc1/0x5b2 [ 22.025055]run_ksoftirqd+0x47/0x70 [ 22.025056]smpboot_thread_fn+0x266/0x320 [ 22.025058]kthread+0x171/0x190 [ 22.025059]ret_from_fork+0x1f/0x30 [ 22.025060] other info that might help us debug this: [ 22.025061] Possible unsafe locking scenario: [ 22.025061]CPU0CPU1 [ 22.025061] [ 22.025062] lock(>seqcount#9); [ 22.025064]lock(>q.lock); [ 22.025065]lock(>seqcount#9); [ 22.025065] lock(>q.lock); [ 22.025066] *** DEADLOCK *** [ 22.025066] 20 locks held by ksoftirqd/0/10: [ 22.025067] #0: 9a4c7140 (rcu_read_lock){}-{1:3}, at: rt_spin_lock+0x5/0xd0 [ 22.025071] #1: 98351ec1a6d0 (per_cpu_ptr(_lock.l.lock, cpu)){}-{3:3}, at: __local_bh_disable_ip+0xbf/0x230 [ 22.025074] #2: 9a4c7140 (rcu_read_lock){}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230 [ 22.025077] #3: 9a4c7140 (rcu_read_lock){}-{1:3}, at: rt_spin_lock+0x5/0xd0 [ 22.025080] #4: 98351ec1b338 (>expiry_lock){+...}-{0:0}, at: run_timer_softirq+0x3e6/0x720 [ 22.025083] #5: b32e8007bd68 ((>mc_ifc_timer)){+...}-{0:0}, at: call_timer_fn+0x5/0x390 [ 22.025086] #6: 9a4c7140 (rcu_read_lock){}-{1:3}, at: mld_sendpack+0x5/0x360 [ 22.025090] #7: 9a4c7140 (rcu_read_lock){}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230 [ 22.025093] #8: 9a4c7100 (rcu_read_lock_bh){}-{1:3}, at: ip6_finish_output2+0x73/0x990 [ 22.025096] #9: 9a4c7140 (rcu_read_lock){}-{1:3}, at: __local_bh_disable_ip+0xfb/0x230 [ 22.025097] #10: 9a4c7100 (rcu_read_lock_bh){}-{1:3}, at: __dev_queue_xmit+0x63/0xe30 [ 22.025100] #11: 9a4c7140 (rcu_read_lock){}-{1:3}, at: 
br_dev_xmit+0x5/0x570
Re: [PATCH] atm: eni: fix the missed pci_disable_device() for eni_init_one()
On Fri, 4 Sep 2020 10:51:03 +0800 Jing Xiangfeng wrote:
> eni_init_one() misses to call pci_disable_device() in an error path.
> Jump to err_disable to fix it.
>
> Signed-off-by: Jing Xiangfeng

Please make sure you add appropriate fixes tags, here:

Fixes: ede58ef28e10 ("atm: remove deprecated use of pci api")

Thank you. Applied.
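The fix described above relies on the common kernel idiom of unwinding through goto labels, so that every failure after the device has been enabled passes through the disable call. The following is only a user-space sketch of that idiom: all mock_ names are hypothetical stand-ins, not the PCI API and not the eni driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device state; stands in for a struct pci_dev. */
struct mock_dev {
	bool enabled;
	void *priv;
};

/* Stand-in for pci_enable_device(); always succeeds here. */
int mock_enable_device(struct mock_dev *dev)
{
	dev->enabled = true;
	return 0;
}

/* Stand-in for pci_disable_device(). */
void mock_disable_device(struct mock_dev *dev)
{
	dev->enabled = false;
}

/*
 * Probe-style init. The two flags simulate failures at different steps
 * so that each unwind label can be exercised.
 */
int mock_init_one(struct mock_dev *dev, bool fail_alloc, bool fail_start)
{
	int rc;

	assert(dev != NULL);

	rc = mock_enable_device(dev);
	if (rc < 0)
		return rc;              /* nothing to undo yet */

	dev->priv = fail_alloc ? NULL : malloc(16);
	if (dev->priv == NULL) {
		rc = -ENOMEM;
		goto err_disable;       /* undo the enable only */
	}

	if (fail_start) {
		rc = -ENODEV;
		goto err_free;          /* undo allocation, then enable */
	}
	return 0;

err_free:
	free(dev->priv);
	dev->priv = NULL;
err_disable:
	mock_disable_device(dev);
	return rc;
}
```

Returning directly from a late error path, instead of jumping to a label, is what leaves the device enabled; ordering the labels in reverse order of acquisition keeps each path complete.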
Re: [PATCH] fscrypt: Reduce object size of logging messages
On Fri, 2020-09-04 at 16:03 -0700, Eric Biggers wrote:
> On Fri, Sep 04, 2020 at 03:10:15PM -0700, Joe Perches wrote:
> > Reduce the object size of logging messages by removing the
> > separate KERN_LEVEL argument and adding it to the format.
> >
> > Miscellanea:
> >
> > o Rename fscrypt_msg to fscrypt_printk
> >
> > x86-64 defconfig with fscrypto:
> >
> > Original sizes:
> > $ size fs/crypto/built-in.a -t
> >    text    data     bss     dec     hex filename
> >    3815     300      24    4139    102b fs/crypto/crypto.o (ex fs/crypto/built-in.a)
> >    4354      84       0    4438    1156 fs/crypto/fname.o (ex fs/crypto/built-in.a)
> >    1484      24       0    1508     5e4 fs/crypto/hkdf.o (ex fs/crypto/built-in.a)
> >    2910      68       0    2978     ba2 fs/crypto/hooks.o (ex fs/crypto/built-in.a)
> >    7797     664      65    8526    214e fs/crypto/keyring.o (ex fs/crypto/built-in.a)
> >    5005     493       0    5498    157a fs/crypto/keysetup.o (ex fs/crypto/built-in.a)
> >    2805       0     544    3349     d15 fs/crypto/keysetup_v1.o (ex fs/crypto/built-in.a)
> >    6391      90       0    6481    1951 fs/crypto/policy.o (ex fs/crypto/built-in.a)
> >    1369      40       0    1409     581 fs/crypto/bio.o (ex fs/crypto/built-in.a)
> >   35930    1763     633   38326    95b6 (TOTALS)
> >
> > New sizes:
> > $ size fs/crypto/built-in.a -t
> >    text    data     bss     dec     hex filename
> >    3874     300      24    4198    1066 fs/crypto/crypto.o (ex fs/crypto/built-in.a)
> >    4347      84       0    4431    114f fs/crypto/fname.o (ex fs/crypto/built-in.a)
> >    1476      24       0    1500     5dc fs/crypto/hkdf.o (ex fs/crypto/built-in.a)
> >    2902      68       0    2970     b9a fs/crypto/hooks.o (ex fs/crypto/built-in.a)
> >    7781     664      65    8510    213e fs/crypto/keyring.o (ex fs/crypto/built-in.a)
> >    4961     493       0    5454    154e fs/crypto/keysetup.o (ex fs/crypto/built-in.a)
> >    2790       0     544    3334     d06 fs/crypto/keysetup_v1.o (ex fs/crypto/built-in.a)
> >    6306      90       0    6396    18fc fs/crypto/policy.o (ex fs/crypto/built-in.a)
> >    1369      40       0    1409     581 fs/crypto/bio.o (ex fs/crypto/built-in.a)
> >   35806    1763     633   38202    953a (TOTALS)
> >
> > Signed-off-by: Joe Perches
> > ---
> >  fs/crypto/crypto.c          | 14 --
> >  fs/crypto/fscrypt_private.h | 12 ++--
> >  2 files changed, 14 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> > index 9212325763b0..c82cc3907e43 100644
> > --- a/fs/crypto/crypto.c
> > +++ b/fs/crypto/crypto.c
> > @@ -329,25 +329,27 @@ int fscrypt_initialize(unsigned int cop_flags)
> >  	return err;
> >  }
> >
> > -void fscrypt_msg(const struct inode *inode, const char *level,
> > -		 const char *fmt, ...)
> > +void fscrypt_printk(const struct inode *inode, const char *fmt, ...)
> >  {
> >  	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> >  				      DEFAULT_RATELIMIT_BURST);
> >  	struct va_format vaf;
> >  	va_list args;
> > +	int level;
> >
> >  	if (!__ratelimit())
> >  		return;
> >
> >  	va_start(args, fmt);
> > -	vaf.fmt = fmt;
> > +	level = printk_get_level(fmt);
> > +	vaf.fmt = printk_skip_level(fmt);
> >  	vaf.va =
> >  	if (inode)
> > -		printk("%sfscrypt (%s, inode %lu): %pV\n",
> > -		       level, inode->i_sb->s_id, inode->i_ino, );
> > +		printk("%c%cfscrypt (%s, inode %lu): %pV\n",
> > +		       KERN_SOH_ASCII, level, inode->i_sb->s_id, inode->i_ino,
> > +		       );
> >  	else
> > -		printk("%sfscrypt: %pV\n", level, );
> > +		printk("%c%cfscrypt: %pV\n", KERN_SOH_ASCII, level, );
> >  	va_end(args);
>
> The problem with this approach is that if fscrypt_printk() is called without
> providing a log level in the format string (which one would assume would work,
> since printk() allows it), then the real format string will be truncated to
> just KERN_SOH because 'level' will be 0.
>
> Can you find a way to avoid that?

While I don't think this is a problem in that all the fscrypt_ calls
will always prefix a KERN_, another approach is to use what btrfs uses:

	char lvl[PRINTK_MAX_SINGLE_HEADER_LEN + 1] = "\0";
	...
	while ((kern_level = printk_get_level(fmt)) != 0) {
		size_t size = printk_skip_level(fmt) - fmt;

		if (kern_level >= '0' && kern_level <= '7') {
			memcpy(lvl, fmt, size);
			lvl[size] = '\0';
		}
		fmt += size;
	}

and use "%s...", lvl, ...

> > -#define fscrypt_warn(inode, fmt, ...)		\
> > -	fscrypt_msg((inode),
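The btrfs-style loop quoted above can be tried outside the kernel. The sketch below is a user-space approximation only: mock_get_level() and mock_split_level() are hypothetical helpers written for this illustration, and the assumption that a header is the byte 0x01 followed by '0'..'7' or 'c' is mine, not taken from the patch. It shows why the "%s" form survives a format string that carries no level: the copied header is simply the empty string, so nothing is truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_SOH_ASCII  '\001'  /* assumed start-of-header byte */
#define MOCK_HEADER_LEN 2       /* SOH byte plus one level character */

/* Return the level character if s starts with a header, else 0. */
int mock_get_level(const char *s)
{
	if (s[0] != MOCK_SOH_ASCII)
		return 0;
	if ((s[1] >= '0' && s[1] <= '7') || s[1] == 'c')
		return s[1];
	return 0;
}

/*
 * Strip every leading header from fmt. The last numeric header seen is
 * copied into lvl; lvl is left as "" when fmt carries no numeric level.
 * Returns a pointer to the first byte after the headers.
 */
const char *mock_split_level(const char *fmt, char lvl[MOCK_HEADER_LEN + 1])
{
	int level;

	assert(fmt != NULL && lvl != NULL);
	lvl[0] = '\0';
	while ((level = mock_get_level(fmt)) != 0) {
		if (level >= '0' && level <= '7') {
			memcpy(lvl, fmt, MOCK_HEADER_LEN);
			lvl[MOCK_HEADER_LEN] = '\0';
		}
		fmt += MOCK_HEADER_LEN;
	}
	return fmt;
}
```

A caller would then print with something like "%sfscrypt: ..." and pass lvl first, which restores the header when there was one and contributes nothing when there was not.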
Re: [PATCH net v2] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices
On Fri, 4 Sep 2020 18:57:27 -0700 Xie He wrote: > On Fri, Sep 4, 2020 at 6:28 PM Xie He wrote: > > > > The HDLC device is not actually prepending any header when it is used > > with this driver. When the PVC device has prepended its header and > > handed over the skb to the HDLC device, the HDLC device just hands it > > over to the hardware driver for transmission without prepending any > > header. > > > > If we grep "header_ops" and "skb_push" in "hdlc.c" and "hdlc_fr.c", we > > can see there is no "header_ops" implemented in these two files and > > all "skb_push" happen in the PVC device in hdlc_fr.c. > > I want to provide a little more information about the flow after an > HDLC device's ndo_start_xmit is called. > > An HDLC hardware driver's ndo_start_xmit is required to point to > hdlc_start_xmit in hdlc.c. When a HDLC device receives a call to its > ndo_start_xmit, hdlc_start_xmit will check if the protocol driver has > provided a xmit function. If it has provided this function, > hdlc_start_xmit will call it to start transmission. If it has not, > hdlc_start_xmit will directly call the hardware driver's function to > start transmission. This driver (hdlc_fr) has not provided a xmit > function in its hdlc_proto struct, so hdlc_start_xmit will directly > call the hardware driver's function to transmit. > > So no header will be prepended after ndo_start_xmit is called. > > There would not be any header prepended before ndo_start_xmit is > called, either, because there is no header_ops implemented in either > hdlc.c or hdlc_fr.c. Thank you for the detailed explanation. > On Fri, Sep 4, 2020 at 6:28 PM Xie He wrote: > > > > Thank you for your email, Jakub! > > > > On Fri, Sep 4, 2020 at 3:14 PM Jakub Kicinski wrote: > > > > > > Since this is a tunnel protocol on top of HDLC interfaces, and > > > hdlc_setup_dev() sets dev->hard_header_len = 16; should we actually > > > set the needed_headroom to 10 + 16 = 26? 
I'm not clear on where/if > > > hdlc devices actually prepend 16 bytes of header, though. > > > > The HDLC device is not actually prepending any header when it is used > > with this driver. When the PVC device has prepended its header and > > handed over the skb to the HDLC device, the HDLC device just hands it > > over to the hardware driver for transmission without prepending any > > header. > > > > If we grep "header_ops" and "skb_push" in "hdlc.c" and "hdlc_fr.c", we > > can see there is no "header_ops" implemented in these two files and > > all "skb_push" happen in the PVC device in hdlc_fr.c. > > > > For this reason, I have previously submitted a patch to change the > > value of hard_header_len of the HDLC device from 16 to 0, because it > > is not actually used. > > > > See: > > 2b7bcd967a0f (drivers/net/wan/hdlc: Change the default of hard_header_len > > to 0) Ah, sorry.. the tree I was looking at did not have this commit. > > > > diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c > > > > index 9acad651ea1f..12b35404cd8e 100644 > > > > --- a/drivers/net/wan/hdlc_fr.c > > > > +++ b/drivers/net/wan/hdlc_fr.c > > > > @@ -1041,7 +1041,7 @@ static void pvc_setup(struct net_device *dev) > > > > { > > > > dev->type = ARPHRD_DLCI; > > > > dev->flags = IFF_POINTOPOINT; > > > > - dev->hard_header_len = 10; > > > > + dev->hard_header_len = 0; > > > > > > Is there a need to set this to 0? Will it not be zero after allocation? > > > > Oh. I understand your point. Theoretically we don't need to set it to > > 0 because it already has the default value of 0. I'm setting it to 0 > > only because I want to tell future developers that this value is > > intentionally set to 0, and it is not carelessly missed out. Sounds fair. Applied to net, thank you!
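The transmit flow Xie He describes can be summarised in a few lines of user-space C. This is a sketch under stated assumptions, not the code in hdlc.c: every mock_ type and function is hypothetical, and the 10-byte figure is simply the old hard_header_len value visible in the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only counts prepended header bytes. */
struct mock_skb {
	unsigned int header_bytes;
};

typedef int (*mock_xmit_fn)(struct mock_skb *skb);

/* Protocol driver ops; the xmit hook is optional. */
struct mock_proto {
	mock_xmit_fn xmit;
};

struct mock_hdlc {
	const struct mock_proto *proto;
	mock_xmit_fn hw_xmit;   /* hardware driver's transmit function */
};

/* The PVC device is the only place a header is pushed. */
void mock_pvc_push_header(struct mock_skb *skb)
{
	skb->header_bytes += 10;
}

/* Hardware driver: sends the frame as handed over, prepends nothing. */
int mock_hw_xmit(struct mock_skb *skb)
{
	(void)skb;
	return 0;
}

/* Dispatch as described: protocol xmit if provided, else hardware xmit. */
int mock_hdlc_start_xmit(const struct mock_hdlc *hdlc, struct mock_skb *skb)
{
	assert(hdlc != NULL && hdlc->hw_xmit != NULL);
	if (hdlc->proto != NULL && hdlc->proto->xmit != NULL)
		return hdlc->proto->xmit(skb);
	return hdlc->hw_xmit(skb);
}
```

With the xmit hook left NULL, as the thread says is the case for hdlc_fr, the frame reaches the hardware function with exactly the bytes the PVC device pushed, which is why only the PVC device needs headroom for its own header.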
linux-next: manual merge of the akpm-current tree with the net-next tree
Hi all,

Today's linux-next merge of the akpm-current tree got a conflict in:

  mm/filemap.c

between commit:

  76cd61739fd1 ("mm/error_inject: Fix allow_error_inject function signatures.")

from the net-next tree and commit:

  2cb138387ead ("mm/filemap: fix storing to a THP shadow entry")

from the akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

--
Cheers,
Stephen Rothwell

diff --cc mm/filemap.c
index 78d07a712112,054d93a86f8a..
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@@ -827,10 -827,10 +827,10 @@@ int replace_page_cache_page(struct pag
  }
  EXPORT_SYMBOL_GPL(replace_page_cache_page);

- static int __add_to_page_cache_locked(struct page *page,
- 				      struct address_space *mapping,
- 				      pgoff_t offset, gfp_t gfp,
- 				      void **shadowp)
+ noinline int __add_to_page_cache_locked(struct page *page,
+ 					struct address_space *mapping,
 -					pgoff_t offset, gfp_t gfp_mask,
++					pgoff_t offset, gfp_t gfp,
+ 					void **shadowp)
  {
  	XA_STATE(xas, >i_pages, offset);
  	int huge = PageHuge(page);
Re: [PATCH] RISC-V: Allow drivers to provide custom read_cycles64 for M-mode kernel
On Sat, Sep 5, 2020 at 6:47 AM Palmer Dabbelt wrote: > > On Fri, 04 Sep 2020 09:57:09 PDT (-0700), Christoph Hellwig wrote: > > On Fri, Sep 04, 2020 at 10:13:18PM +0530, Anup Patel wrote: > >> I respectfully disagree. IMHO, the previous code made the RISC-V > >> timer driver convoluted (both SBI call and CLINT in one place) and > >> mandated CLINT for NoMMU kernel. In fact, RISC-V spec does not > >> mandate CLINT or PLIC. The RISC-V SOC vendors are free to > >> implement their own timer device, IPI device and interrupt controller. > > > > Yes, exactly what we need is everyone coming up with another stupid > > non-standard timer and irq driver. > > Well, we don't have a standard one so there's really no way around people > coming up with their own. It doesn't seem reasonable to just say "SiFive's > driver landed first, so we will accept no other timer drivers for RISC-V > systems". I share the same views here. In ARM 32bit world (arch/arm/), we have the same problem with no standard timer device, IPI device, and interrupt controller. The ARM GICv2/GICv3 and ARM Generic Timers were standardized very late in the ARM world so by that time we had lots of custom timers and interrupt controllers. All these ARM timer and interrupt controller drivers are now part of drivers/clocksource and drivers/irqchip. The ARM 32bit world has following indirections available to drivers: 1. set_smp_cross_call() in asm/smp.h for IPI injection (We have riscv_set_ipi_ops() in asm/smp.h) 2. register_current_timer_delay() in asm/delay.h (My patch is proposing riscv_set_read_cycles64() in asm/timex.h) For RISC-V S-mode (MMU) kernel, we are using SBI calls for IPIs and "TIME CSR + SBI calls" (i.e. RISC-V timer) as timer device which simplifies things for regular S-mode kernel. For RISC-V M-mode (NoMMU) kernel, we don't have any SBI provider so we end-up having separate drivers for timer device, and IPI device which is similar to ARM 32bit world. 
> > > But the point is this crap came in after -rc1, and it adds totally > > pointless indirect calls to the IPI path, and with your "fix" also > > to get_cycles which all have exactly one implementation for MMU or > > NOMMU kernels. > > > > So the only sensible thing is to revert all this crap. And if at some > > point we actually have to deal with different implementations do it > > with alternatives or static_branch infrastructure so that we don't > > pay the price for indirect calls in the super hot path. > > I'm OK reverting the dynamic stuff, as I can buy it needs more time to bake, > but I'm not sure performance is the right argument -- while this is adding an > indirection, decoupling MMU/NOMMU from the timer driver is the first step > towards getting rid of the traps which are a way bigger performance issue than > the indirection (not to mention the issue of relying on instructions that > don't > technically exist in the ISA we're relying on any more). > > I'm not really convinced the timers are on such a hot path that an extra load > is that bad, but I don't have that much experience with this stuff so you may > be right. I'd prefer to keep the driver separate, though, and just bring back > the direct CLINT implementation in timex.h -- we've only got one > implementation > for now anyway, so it doesn't seem that bad to just inline it (and I suppose I > could buy that the ISA says this register has to behave this way, though I > don't think that's all that strong of an argument). > > I'm not convinced this is a big performance hit for IPIs either, but we could > just do the same thing over there -- though I guess I'd be much less convinced > about any arguments as to the ISA having a say in that as IIRC it's a lot more > hands off. > > Something like this seems to fix the rdtime issue without any extra overhead, > but I haven't tested it I had initially thought about directly doing MMIO in asm/timex.h. 
Your patch is CLINT specific because it assumes a 64bit MMIO register which
is always counting upwards. This will break if we have a downward counting
timer on some SOC. It will also break if some SOC has an implementation
specific CSR for reading cycles.

I am fine with your patch if you can address the above mentioned issues.

>
> diff --git a/arch/riscv/include/asm/clint.h b/arch/riscv/include/asm/clint.h
> new file mode 100644
> index ..51909ab60ad0
> --- /dev/null
> +++ b/arch/riscv/include/asm/clint.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020 Google, Inc
> + */
> +
> +#ifndef _ASM_RISCV_CLINT_H
> +#define _ASM_RISCV_CLINT_H
> +
> +#include
> +#include
> +
> +#ifdef CONFIG_RISCV_M_MODE
> +/*
> + * This lives in the CLINT driver, but is accessed directly by timex.h to avoid
> + * any overhead when accessing the MMIO timer.
> + */
> +extern u64 __iomem *clint_time_val;
> +#endif
> +
> +#endif
> diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
> index
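To make the objection concrete, here is a user-space sketch of the alternative, where the timer driver supplies the function that reads the free-running counter. It is an illustration only and not the proposed patch: all mock_ names are hypothetical, the 32-bit down-counting register is an invented example of a non-CLINT timer, and the wrap handling assumes the counter is read at least once per wrap period. It uses a plain function pointer, which is the indirect call objected to earlier in the thread; a static_branch or alternatives based variant would avoid that cost but is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: a driver-supplied 64-bit cycle reader. */
typedef uint64_t (*read_cycles64_fn)(void);

/* Fallback used until a timer driver registers its own reader. */
static uint64_t mock_default_read_cycles64(void)
{
	return 0;
}

static read_cycles64_fn mock_read_cycles64 = mock_default_read_cycles64;

/* Hypothetical registration call a timer driver would make at probe. */
void mock_set_read_cycles64(read_cycles64_fn fn)
{
	assert(fn != NULL);
	mock_read_cycles64 = fn;
}

/* What a generic cycle read reduces to here: one indirect call. */
uint64_t mock_get_cycles64(void)
{
	return mock_read_cycles64();
}

/* Simulated SoC timer register: 32 bits wide and counting DOWN. */
uint32_t mock_timer_reg = UINT32_MAX;

static uint64_t mock_wrap_base;  /* sum of completed 32-bit periods */
static uint32_t mock_last_up;    /* last up-counting value observed */

/* Driver-side reader: turns the down-counter into a 64-bit up-count. */
uint64_t mock_soc_read_cycles64(void)
{
	uint32_t up = UINT32_MAX - mock_timer_reg;

	if (up < mock_last_up)          /* counter reloaded since last read */
		mock_wrap_base += (uint64_t)1 << 32;
	mock_last_up = up;
	return mock_wrap_base + up;
}
```

A header that reads one fixed 64-bit up-counting MMIO register directly has no place to put the inversion or the wrap tracking, which is the breakage described above.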
[PATCH] fs: get rid of warnings when built with W=1
There are two warnings when built with W=1:
fs/open.c:887: warning: Excess function parameter 'opened' description in 'finish_open'
fs/open.c:929: warning: Excess function parameter 'cred' description in 'vfs_open'

As there are two comments for deleted parameters, remove them.

Signed-off-by: Xiaofei Tan
---
 fs/open.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 9af548f..3f6df10 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -869,7 +869,6 @@ static int do_dentry_open(struct file *f,
  * @file: file pointer
  * @dentry: pointer to dentry
  * @open: open callback
- * @opened: state of open
  *
  * This can be used to finish opening a file passed to i_op->atomic_open().
  *
@@ -923,7 +922,6 @@ EXPORT_SYMBOL(file_path);
  * vfs_open - open the file at the given path
  * @path: path to open
  * @file: newly allocated file with f_flag initialized
- * @cred: credentials to use
  */
 int vfs_open(const struct path *path, struct file *file)
 {
--
2.8.1
Re: [PATCH] tools feature: Add missing -lzstd to the fast path feature detection
On Sat, Sep 5, 2020 at 5:26 AM Arnaldo Carvalho de Melo wrote: > > We were failing that due to GTK2+ and then for the ZSTD test, which made > test-all.c, the fast path feature detection file to fail and thus > trigger building all of the feature tests, slowing down the test. > > Eventually the ZSTD test would be built and would succeed, since it had > the needed -lzstd, avoiding: > > $ cat /tmp/build/perf/feature/test-all.make.output > /usr/bin/ld: /tmp/ccRRJQ4u.o: in function `main_test_libzstd': > /home/acme/git/perf/tools/build/feature/test-libzstd.c:8: undefined > reference to `ZSTD_createCStream' > /usr/bin/ld: /home/acme/git/perf/tools/build/feature/test-libzstd.c:9: > undefined reference to `ZSTD_freeCStream' > collect2: error: ld returned 1 exit status > $ > > Fix it by adding -lzstd to the test-all target. > > Now I need an entry to 'perf test' to make sure that > /tmp/build/perf/feature/test-all.make.output is empty... > > Fixes: 3b1c5d9659718263 ("tools build: Implement libzstd feature check, > LIBZSTD_DIR and NO_LIBZSTD defines") > Cc: Adrian Hunter > Cc: Alexey Budankov > Cc: Ian Rogers > Cc: Jiri Olsa > Cc: Namhyung Kim > Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Namhyung Kim Thanks Namhyung > > --- > > diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile > index 977067e34dff064d..ec815ffca02b 100644 > --- a/tools/build/feature/Makefile > +++ b/tools/build/feature/Makefile > @@ -91,7 +91,7 @@ __BUILDXX = $(CXX) $(CXXFLAGS) -MD -Wall -Werror -o $@ > $(patsubst %.bin,%.cpp,$( > ### > > $(OUTPUT)test-all.bin: > - $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf > -lnuma -lelf -I/usr/include/slang -lslang $(FLAGS_PERL_EMBED) > $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma > + $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf > -lnuma -lelf -I/usr/include/slang -lslang $(FLAGS_PERL_EMBED) > $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -lzstd > > 
$(OUTPUT)test-hello.bin: > $(BUILD)
Re: [PATCH 23/23] Documentation: gpio: add documentation for gpio-mockup
Hi, On 9/4/20 8:45 AM, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski > > There's some documentation for gpio-mockup's debugfs interface in the > driver's source but it's not much. Add proper documentation for this > testing module. > > Signed-off-by: Bartosz Golaszewski > --- > .../admin-guide/gpio/gpio-mockup.rst | 87 +++ > 1 file changed, 87 insertions(+) > create mode 100644 Documentation/admin-guide/gpio/gpio-mockup.rst > > diff --git a/Documentation/admin-guide/gpio/gpio-mockup.rst > b/Documentation/admin-guide/gpio/gpio-mockup.rst > new file mode 100644 > index ..1d452ee55f8d > --- /dev/null > +++ b/Documentation/admin-guide/gpio/gpio-mockup.rst > @@ -0,0 +1,87 @@ > +.. SPDX-License-Identifier: GPL-2.0-only > + > +GPIO Testing Driver > +=== > + > +The GPIO Testing Driver (gpio-mockup) provides a way to create simulated GPIO > +chips for testing purposes. There are two ways of configuring the chips > exposed > +by the module. The lines can be accessed using the standard GPIO character > +device interface as well as manipulated using the dedicated debugfs directory > +structure. Could configfs be used for this instead of debugfs? debugfs is ad hoc. > + > +Creating simulated chips using debugfs > +-- > + > +When the gpio-mockup module is loaded (or builtin) it creates its own > directory > +in debugfs. Assuming debugfs is mounted at /sys/kernel/debug/, the directory > +will be located at /sys/kernel/debug/gpio-mockup/. Inside this directory > there > +are two attributes: new_device and delete_device. > + > +New chips can be created by writing a single line containing a number of > +options to "new_device". For example: > + > +.. 
code-block:: sh > + > +$ echo "label=my-mockup num_lines=4 named_lines" > > /sys/kernel/debug/gpio-mockup/new_device > + > +Supported options: > + > +num_lines= - number of GPIO lines to expose > + > +label= - label of the dummy chip > + > +named_lines - defines whether dummy lines should be named, the names are > + of the form X-Y where X is the chip's label and Y is the > + line's offset > + > +Note: only num_lines is mandatory. > + > +Chips can be dynamically removed by writing the chip's label to > +"delete_device". For example: > + > +.. code-block:: sh > + > +echo "gpio-mockup.0" > /sys/kernel/debug/gpio-mockup/delete_device > + > +Creating simulated chips using module params > + > + > +Note: this is an older, now deprecated method kept for backward compatibility > +for user-space tools. > + > +When loading the gpio-mockup driver a number of parameters can be passed to > the > +module. > + > +gpio_mockup_ranges > + > +This parameter takes an argument in the form of an array of integer > +pairs. Each pair defines the base GPIO number (if any) and the number > +of lines exposed by the chip. If the base GPIO is -1, the gpiolib > +will assign it automatically. > + > +Example: gpio_mockup_ranges=-1,8,-1,16,405,4 > + > +The line above creates three chips. The first one will expose 8 > lines, > +the second 16 and the third 4. The base GPIO for the third chip is > set > +to 405 while for two first chips it will be assigned automatically. > + > +gpio_named_lines > + > +This parameter doesn't take any arguments. It lets the driver know > that > +GPIO lines exposed by it should be named. > + > +The name format is: gpio-mockup-X-Y where X is the letter associated > +with the mockup chip and Y is the line offset. Where does this 'X' letter associated with the mockup chip come from? > + > +Manipulating simulated lines > + > + > +Each mockup chip creates its own subdirectory in > /sys/kernel/debug/gpio-mockup/. > +The directory is named after the chip's label. 
A symlink is also created, > named > +after the chip's name, which points to the label directory. > + > +Inside each subdirectory, there's a separate attribute for each GPIO line. > The > +name of the attribute represents the line's offset in the chip. > + > +Reading from a line attribute returns the current value. Writing to it (0 or > 1) > +changes its pull. What does "pull" mean here? thanks. -- ~Randy
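To make the gpio_mockup_ranges example above concrete, here is a minimal user-space sketch of how a string such as "-1,8,-1,16,405,4" breaks down into (base, line count) pairs. It is not the driver's parser: the kernel takes this value through its module parameter machinery, and `struct chip_range` and `parse_ranges()` are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical holder for one "base,count" pair; not a kernel structure. */
struct chip_range {
    int base;   /* -1 means "let gpiolib pick the base" */
    int ngpio;  /* number of lines exposed by the chip */
};

/*
 * Parse "base,count,base,count,..." into out[].
 * Returns the number of chips described, or -1 on malformed input.
 */
int parse_ranges(const char *s, struct chip_range *out, int max)
{
    int n = 0;

    while (*s) {
        char *end;
        long base, count;

        if (n == max)
            return -1;

        base = strtol(s, &end, 10);
        if (end == s || *end != ',')
            return -1;          /* every base must be followed by a count */
        s = end + 1;

        count = strtol(s, &end, 10);
        if (end == s || count <= 0)
            return -1;

        out[n].base = (int)base;
        out[n].ngpio = (int)count;
        n++;

        if (*end == ',' && end[1] != '\0')
            s = end + 1;        /* another pair follows */
        else if (*end == '\0')
            s = end;            /* clean end of string */
        else
            return -1;          /* trailing comma or stray character */
    }
    return n;
}
```

With the example from the documentation, the sketch reports three chips with 8, 16 and 4 lines, and only the third has a fixed base (405).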
Re: [PATCH net-next] net/packet: Remove unused macro BLOCK_PRIV
On 2020/9/4 21:26, Willem de Bruijn wrote: On Fri, Sep 4, 2020 at 3:09 PM Wang Hai wrote: BPDU_TYPE_TCN is never used after it was introduced. So better to remove it. This comment does not cover the patch contents. Otherwise the patch looks good to me. Thanks for your review, I will revise this comment. Reported-by: Hulk Robot Signed-off-by: Wang Hai --- net/packet/af_packet.c | 1 - 1 file changed, 1 deletion(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index da8254e680f9..c430672c6a67 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -177,7 +177,6 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, #define BLOCK_LEN(x) ((x)->hdr.bh1.blk_len) #define BLOCK_SNUM(x) ((x)->hdr.bh1.seq_num) #define BLOCK_O2PRIV(x)((x)->offset_to_priv) -#define BLOCK_PRIV(x) ((void *)((char *)(x) + BLOCK_O2PRIV(x))) struct packet_sock; static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, -- 2.17.1 .
[PATCH] Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()
After we Stop and later Start a VM that uses Accelerated Networking (NIC SR-IOV), currently the VF vmbus device's Instance GUID can change, so after vmbus_bus_resume() -> vmbus_request_offers(), vmbus_onoffer() can not find the original vmbus channel of the VF, and hence we can't complete() vmbus_connection.ready_for_resume_event in check_ready_for_resume_event(), and the VM hangs in vmbus_bus_resume() forever. Fix the issue by adding a timeout, so the resuming can still succeed, and the saved state is not lost, and according to my test, the user can disable Accelerated Networking and then will be able to SSH into the VM for further recovery. Also prevent the VM in question from suspending again. The host will be fixed so in future the Instance GUID will stay the same across hibernation. Fixes: d8bd2d442bb2 ("Drivers: hv: vmbus: Resume after fixing up old primary channels") Signed-off-by: Dexuan Cui --- drivers/hv/vmbus_drv.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 910b6e90866c..946d0aba101f 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -2382,7 +2382,10 @@ static int vmbus_bus_suspend(struct device *dev) if (atomic_read(_connection.nr_chan_close_on_suspend) > 0) wait_for_completion(_connection.ready_for_suspend_event); - WARN_ON(atomic_read(_connection.nr_chan_fixup_on_resume) != 0); + if (atomic_read(_connection.nr_chan_fixup_on_resume) != 0) { + pr_err("Can not suspend due to a previous failed resuming\n"); + return -EBUSY; + } mutex_lock(_connection.channel_mutex); @@ -2456,7 +2459,9 @@ static int vmbus_bus_resume(struct device *dev) vmbus_request_offers(); - wait_for_completion(_connection.ready_for_resume_event); + if (wait_for_completion_timeout( + _connection.ready_for_resume_event, 10 * HZ) == 0) + pr_err("Some vmbus device is missing after suspending?\n"); /* Reset the event for the next suspend. 
*/ reinit_completion(_connection.ready_for_suspend_event); -- 2.19.1
RE: [RFC v2 11/11] scsi: storvsc: Support PAGE_SIZE larger than 4K
From: Boqun Feng Sent: Tuesday, September 1, 2020 8:01 PM > > Hyper-V always uses 4k page size (HV_HYP_PAGE_SIZE), so when > communicating with Hyper-V, a guest should always use HV_HYP_PAGE_SIZE > as the unit for page related data. For storvsc, the data is > vmbus_packet_mpb_array. And since in scsi_cmnd, sglist of pages (in unit > of PAGE_SIZE) is used, we need to convert pages in the sglist of scsi_cmnd > into Hyper-V pages in vmbus_packet_mpb_array. > > This patch does the conversion by dividing pages in sglist into Hyper-V > pages; offset and indexes in vmbus_packet_mpb_array are recalculated > accordingly. > > Signed-off-by: Boqun Feng > --- > drivers/scsi/storvsc_drv.c | 60 ++ > 1 file changed, 54 insertions(+), 6 deletions(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > index 8f5f5dc863a4..3f6610717d4e 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1739,23 +1739,71 @@ static int storvsc_queuecommand(struct Scsi_Host > *host, struct > scsi_cmnd *scmnd) > payload_sz = sizeof(cmd_request->mpb); > > if (sg_count) { > - if (sg_count > MAX_PAGE_BUFFER_COUNT) { > + unsigned int hvpg_idx = 0; > + unsigned int j = 0; > + unsigned long hvpg_offset = sgl->offset & ~HV_HYP_PAGE_MASK; > + unsigned int hvpg_count = HVPFN_UP(hvpg_offset + length); > > - payload_sz = (sg_count * sizeof(u64) + > + if (hvpg_count > MAX_PAGE_BUFFER_COUNT) { > + > + payload_sz = (hvpg_count * sizeof(u64) + > sizeof(struct vmbus_packet_mpb_array)); > payload = kzalloc(payload_sz, GFP_ATOMIC); > if (!payload) > return SCSI_MLQUEUE_DEVICE_BUSY; > } > > + /* > + * sgl is a list of PAGEs, and payload->range.pfn_array > + * expects the page number in the unit of HV_HYP_PAGE_SIZE (the > + * page size that Hyper-V uses), so here we need to divide PAGEs > + * into HV_HYP_PAGE in case that PAGE_SIZE > HV_HYP_PAGE_SIZE.
> + */ > payload->range.len = length; > - payload->range.offset = sgl[0].offset; > + payload->range.offset = sgl[0].offset & ~HV_HYP_PAGE_MASK; > + hvpg_idx = sgl[0].offset >> HV_HYP_PAGE_SHIFT; > > cur_sgl = sgl; > - for (i = 0; i < sg_count; i++) { > - payload->range.pfn_array[i] = > - page_to_pfn(sg_page((cur_sgl))); > + for (i = 0, j = 0; i < sg_count; i++) { > + /* > + * "PAGE_SIZE / HV_HYP_PAGE_SIZE - hvpg_idx" is the # > + * of HV_HYP_PAGEs in the current PAGE. > + * > + * "hvpg_count - j" is the # of unhandled HV_HYP_PAGEs. > + * > + * As shown in the following, the minimal of both is > + * the # of HV_HYP_PAGEs, we need to handle in this > + * PAGE. > + * > + * |-- PAGE --| > + * | PAGE_SIZE / HV_HYP_PAGE_SIZE in total | > + * |hvpg|hvpg| ... |hvpg|... |hvpg| > + * ^ ^ > + * hvpg_idx| > + * ^ | > + * +---(hvpg_count - j)--+ > + * > + * or > + * > + * |-- PAGE --| > + * | PAGE_SIZE / HV_HYP_PAGE_SIZE in total | > + * |hvpg|hvpg| ... |hvpg|... |hvpg| > + * ^ > ^ > + * hvpg_idx > | > + * ^ > | > + * +---(hvpg_count - > j)+ > + */ > + unsigned int nr_hvpg = min((unsigned int)(PAGE_SIZE / > HV_HYP_PAGE_SIZE) - hvpg_idx, > +hvpg_count - j); > + unsigned int k; > + > + for (k = 0; k < nr_hvpg; k++) { > + payload->range.pfn_array[j] = > + page_to_hvpfn(sg_page((cur_sgl))) + > hvpg_idx + k; > + j++; > + } > cur_sgl = sg_next(cur_sgl); > +
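The page-splitting loop quoted above can be modelled in user space. The following is a minimal sketch under stated assumptions, not the driver code: it assumes a 64K guest page and a 4K Hyper-V page, takes plain arrays instead of a scatterlist, and `split_into_hv_pfns()` is a hypothetical name. The quoted hunk is cut off before the end of the loop, so resetting the in-page index to 0 for every page after the first is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for illustration: 64K guest pages, 4K Hyper-V pages. */
#define GUEST_PAGE_SHIFT 16
#define HV_PAGE_SHIFT    12
#define HV_PAGE_SIZE     (1UL << HV_PAGE_SHIFT)
#define HV_PER_GUEST     (1U << (GUEST_PAGE_SHIFT - HV_PAGE_SHIFT))

/*
 * Expand guest page frame numbers into Hyper-V page frame numbers.
 *   guest_pfns   - one entry per guest page of the transfer
 *   first_offset - byte offset of the data inside the first guest page
 *   length       - total transfer length in bytes
 *   out, out_cap - destination array and its capacity
 * Returns the number of Hyper-V PFNs written to out[].
 */
size_t split_into_hv_pfns(const uint64_t *guest_pfns, size_t nr_guest,
                          unsigned long first_offset, unsigned long length,
                          uint64_t *out, size_t out_cap)
{
    /* Offset inside the first Hyper-V page, and how many such pages we span. */
    unsigned long hv_offset = first_offset & (HV_PAGE_SIZE - 1);
    size_t hv_count = (hv_offset + length + HV_PAGE_SIZE - 1) >> HV_PAGE_SHIFT;
    /* Index of the first Hyper-V page used inside the first guest page. */
    unsigned int hv_idx = first_offset >> HV_PAGE_SHIFT;
    size_t i, j = 0;

    for (i = 0; i < nr_guest && j < hv_count; i++) {
        unsigned int avail = HV_PER_GUEST - hv_idx;  /* left in this guest page */
        size_t left = hv_count - j;                  /* still to be emitted */
        unsigned int nr = left < avail ? (unsigned int)left : avail;
        unsigned int k;

        for (k = 0; k < nr && j < out_cap; k++)
            out[j++] = (guest_pfns[i] << (GUEST_PAGE_SHIFT - HV_PAGE_SHIFT))
                       + hv_idx + k;

        hv_idx = 0;  /* assumption: later guest pages start at their first slot */
    }
    return j;
}
```

For example, a 64K transfer starting 0x1800 bytes into guest page 10 and continuing into guest page 11 touches 17 Hyper-V pages: 15 from the first guest page (slots 1 to 15) and 2 from the second.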
[PATCH] PCI: hv: Fix hibernation in case interrupts are not re-created
Hyper-V doesn't trap and emulate the accesses to the MSI/MSI-X registers, and we must use hv_compose_msi_msg() to ask Hyper-V to create the IOMMU Interrupt Remapping Table Entries. This is not an issue for a lot of PCI device drivers (e.g. NVMe driver, Mellanox NIC drivers), which destroy and re-create the interrupts across hibernation, so hv_compose_msi_msg() is called automatically. However, some other PCI device drivers (e.g. the Nvidia driver) may not destroy and re-create the interrupts across hibernation, so hv_pci_resume() has to call hv_compose_msi_msg(), otherwise the PCI device drivers can no longer receive MSI/MSI-X interrupts after hibernation. Fixes: ac82fc832708 ("PCI: hv: Add hibernation support") Cc: Jake Oshins Signed-off-by: Dexuan Cui --- drivers/pci/controller/pci-hyperv.c | 44 + 1 file changed, 44 insertions(+) diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index fc4c3a15e570..abefff9a20e1 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1211,6 +1211,21 @@ static void hv_irq_unmask(struct irq_data *data) pbus = pdev->bus; hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata); + if (hbus->state == hv_pcibus_removing) { + /* +* During hibernation, when a CPU is offlined, the kernel tries +* to move the interrupt to the remaining CPUs that haven't +* been offlined yet. In this case, the below hv_do_hypercall() +* always fails since the vmbus channel has been closed, so we +* should not call the hypercall, but we still need +* pci_msi_unmask_irq() to reset the mask bit in desc->masked: +* see cpu_disable_common() -> fixup_irqs() -> +* irq_migrate_all_off_this_cpu() -> migrate_one_irq(). 
+*/ + pci_msi_unmask_irq(data); + return; + } + spin_lock_irqsave(>retarget_msi_interrupt_lock, flags); params = >retarget_msi_interrupt_params; @@ -3372,6 +3387,33 @@ static int hv_pci_suspend(struct hv_device *hdev) return 0; } +static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg) +{ + struct msi_desc *entry; + struct irq_data *irq_data; + + for_each_pci_msi_entry(entry, pdev) { + irq_data = irq_get_irq_data(entry->irq); + if (WARN_ON_ONCE(!irq_data)) + return -EINVAL; + + hv_compose_msi_msg(irq_data, >msg); + } + + return 0; +} + +/* + * Upon resume, pci_restore_msi_state() -> ... -> __pci_write_msi_msg() + * re-writes the MSI/MSI-X registers, but since Hyper-V doesn't trap and + * emulate the accesses, we have to call hv_compose_msi_msg() to ask + * Hyper-V to re-create the IOMMU Interrupt Remapping Table Entries. + */ +static void hv_pci_restore_msi_state(struct hv_pcibus_device *hbus) +{ + pci_walk_bus(hbus->pci_bus, hv_pci_restore_msi_msg, NULL); +} + static int hv_pci_resume(struct hv_device *hdev) { struct hv_pcibus_device *hbus = hv_get_drvdata(hdev); @@ -3405,6 +3447,8 @@ static int hv_pci_resume(struct hv_device *hdev) prepopulate_bars(hbus); + hv_pci_restore_msi_state(hbus); + hbus->state = hv_pcibus_installed; return 0; out: -- 2.19.1
[PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver
mlx5_suspend()/resume() keep the network interface, so during hibernation netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence netvsc_resume() should call netvsc_vf_changed() to switch the data path back to the VF after hibernation. Similarly, netvsc_suspend() should not call netvsc_unregister_vf(). BTW, mlx4_suspend()/resume() are different in that they destroy and re-create the network device, so netvsc_register_vf() and netvsc_unregister_vf() are automatically called. Note: mlx4 can also work with the changes here because in netvsc_suspend()/resume() ndev_ctx->vf_netdev is NULL for mlx4. Fixes: 0efeea5fb153 ("hv_netvsc: Add the support of hibernation") Signed-off-by: Dexuan Cui --- drivers/net/hyperv/netvsc_drv.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 64b0a74c1523..f896059a9588 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2587,7 +2587,7 @@ static int netvsc_remove(struct hv_device *dev) static int netvsc_suspend(struct hv_device *dev) { struct net_device_context *ndev_ctx; - struct net_device *vf_netdev, *net; + struct net_device *net; struct netvsc_device *nvdev; int ret; @@ -2604,10 +2604,6 @@ static int netvsc_suspend(struct hv_device *dev) goto out; } - vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev); - if (vf_netdev) - netvsc_unregister_vf(vf_netdev); - /* Save the current config info */ ndev_ctx->saved_netvsc_dev_info = netvsc_devinfo_get(nvdev); @@ -2623,6 +2619,7 @@ static int netvsc_resume(struct hv_device *dev) struct net_device *net = hv_get_drvdata(dev); struct net_device_context *net_device_ctx; struct netvsc_device_info *device_info; + struct net_device *vf_netdev; int ret; rtnl_lock(); @@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev) netvsc_devinfo_put(device_info); net_device_ctx->saved_netvsc_dev_info = NULL; + vf_netdev = 
rtnl_dereference(net_device_ctx->vf_netdev); + if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK) + ret = -EINVAL; + rtnl_unlock(); return ret; -- 2.19.1
RE: [PATCH] ice: Fix memleak in ice_set_ringparam
> From: Dinghao Liu > Sent: Wednesday, August 26, 2020 7:34 PM > To: dinghao@zju.edu.cn; k...@umn.edu > Cc: Kirsher, Jeffrey T ; David S. Miller > ; Jakub Kicinski ; Alexei > Starovoitov ; Daniel Borkmann ; > Jesper Dangaard Brouer ; John Fastabend > ; intel-wired-...@lists.osuosl.org; > net...@vger.kernel.org; linux-kernel@vger.kernel.org; > b...@vger.kernel.org > Subject: [PATCH] ice: Fix memleak in ice_set_ringparam > > When kcalloc() on rx_rings fails, we should free tx_rings > and xdp_rings to prevent memleak. Similarly, when > ice_alloc_rx_bufs() fails, we should free xdp_rings. > > Signed-off-by: Dinghao Liu > --- > drivers/net/ethernet/intel/ice/ice_ethtool.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) Tested-by: Aaron Brown
[PATCH] arm64: PCI: fix memleak when calling pci_iomap/unmap()
CONFIG_GENERIC_IOMAP is disabled on arm64, so pci_iounmap() does nothing. As a result, a mapping created with pci_iomap() is never released by pci_iounmap(), which leads to a memory leak. Implement pci_iounmap() for arm64 to fix this leak. Fixes: 09a5723983e3 ("arm64: Use include/asm-generic/io.h") Signed-off-by: Yang Yingliang --- arch/arm64/include/asm/io.h | 5 + arch/arm64/kernel/pci.c | 5 + 2 files changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h index ff50dd731852d..4d8da06ac295f 100644 --- a/arch/arm64/include/asm/io.h +++ b/arch/arm64/include/asm/io.h @@ -18,6 +18,11 @@ #include #include +#ifdef CONFIG_PCI +struct pci_dev; +#define pci_iounmap pci_iounmap +extern void pci_iounmap(struct pci_dev *dev, void __iomem *addr); +#endif /* * Generic IO read/write. These perform native-endian accesses. */ diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c index 1006ed2d7c604..ddfa1c53def48 100644 --- a/arch/arm64/kernel/pci.c +++ b/arch/arm64/kernel/pci.c @@ -217,4 +217,9 @@ void pcibios_remove_bus(struct pci_bus *bus) acpi_pci_remove_bus(bus); } +void pci_iounmap(struct pci_dev *dev, void __iomem *addr) +{ + iounmap(addr); +} +EXPORT_SYMBOL(pci_iounmap); #endif -- 2.25.1
Re: [PATCH v3 04/10] PCI/RCEC: Add pcie_walk_rcec() to walk associated RCiEPs
On Fri, Sep 04, 2020 at 10:18:30PM +, Kelley, Sean V wrote: > Hi Bjorn, > > Quick question below... > > On Wed, 2020-09-02 at 14:55 -0700, Sean V Kelley wrote: > > Hi Bjorn, > > > > On Wed, 2020-09-02 at 14:00 -0500, Bjorn Helgaas wrote: > > > On Wed, Aug 12, 2020 at 09:46:53AM -0700, Sean V Kelley wrote: > > > > From: Qiuxu Zhuo > > > > > > > > When an RCEC device signals error(s) to a CPU core, the CPU core > > > > needs to walk all the RCiEPs associated with that RCEC to check > > > > errors. So add the function pcie_walk_rcec() to walk all RCiEPs > > > > associated with the RCEC device. > > > > > > > > Co-developed-by: Sean V Kelley > > > > Signed-off-by: Sean V Kelley > > > > Signed-off-by: Qiuxu Zhuo > > > > Reviewed-by: Jonathan Cameron > > > > --- > > > > drivers/pci/pci.h | 4 +++ > > > > drivers/pci/pcie/rcec.c | 76 > > > > + > > > > 2 files changed, 80 insertions(+) > > > > > > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > > > > index bd25e6047b54..8bd7528d6977 100644 > > > > --- a/drivers/pci/pci.h > > > > +++ b/drivers/pci/pci.h > > > > @@ -473,9 +473,13 @@ static inline void pci_dpc_init(struct > > > > pci_dev > > > > *pdev) {} > > > > #ifdef CONFIG_PCIEPORTBUS > > > > void pci_rcec_init(struct pci_dev *dev); > > > > void pci_rcec_exit(struct pci_dev *dev); > > > > +void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct > > > > pci_dev > > > > *, void *), > > > > + void *userdata); > > > > #else > > > > static inline void pci_rcec_init(struct pci_dev *dev) {} > > > > static inline void pci_rcec_exit(struct pci_dev *dev) {} > > > > +static inline void pcie_walk_rcec(struct pci_dev *rcec, int > > > > (*cb)(struct pci_dev *, void *), > > > > + void *userdata) {} > > > > #endif > > > > > > > > #ifdef CONFIG_PCI_ATS > > > > diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c > > > > index 519ae086ff41..405f92fcdf7f 100644 > > > > --- a/drivers/pci/pcie/rcec.c > > > > +++ b/drivers/pci/pcie/rcec.c > > > > @@ -17,6 +17,82 @@ > > > > 
> > > > #include "../pci.h" > > > > > > > > +static int pcie_walk_rciep_devfn(struct pci_bus *bus, int > > > > (*cb)(struct pci_dev *, void *), > > > > +void *userdata, const unsigned long > > > > bitmap) > > > > +{ > > > > + unsigned int devn, fn; > > > > + struct pci_dev *dev; > > > > + int retval; > > > > + > > > > + for_each_set_bit(devn, , 32) { > > > > + for (fn = 0; fn < 8; fn++) { > > > > + dev = pci_get_slot(bus, PCI_DEVFN(devn, fn)); > > > > > > Wow, this is a lot of churning to call pci_get_slot() 256 times per > > > bus for the "associated bus numbers" case where we pass a bitmap of > > > 0x. They didn't really make it easy for software when they > > > added the next/last bus number thing. > > > > > > Just thinking out loud here. What if we could set dev->rcec during > > > enumeration, and then use that to build pcie_walk_rcec()? > > > > I think follow what you are doing. > > > > As we enumerate an RCEC, use the time to discover RCiEPs and > > associate > > each RCiEP's dev->rcec. Although BIOS already set the bitmap for this > > specific RCEC, it's more efficient to simply discover the devices > > through the bus walk and verify each one found against the bitmap. > > > > Further, while we can be certain that an RCiEP found with a matching > > device no. in a bitmap for an associated RCEC is correct, we cannot > > be > > certain that any RCiEP found on another bus range is correct unless > > we > > verify the bus is within that next/last bus range. > > > > Finally, that's where find_rcec() callback for rcec_assoc_rciep() > > does > > double duty by also checking on the "on-a-separate-bus" case captured > > potentially by find_rcec() during an RCiEP's bus walk. 
> > > > > > > bool rcec_assoc_rciep(rcec, rciep) > > > { > > > if (rcec->bus == rciep->bus) > > > return (rcec->bitmap contains rciep->devfn); > > > > > > return (rcec->next/last contains rciep->bus); > > > } > > > > > > link_rcec(dev, data) > > > { > > > struct pci_dev *rcec = data; > > > > > > if ((dev is RCiEP) && rcec_assoc_rciep(rcec, dev)) > > > dev->rcec = rcec; > > > } > > > > > > find_rcec(dev, data) > > > { > > > struct pci_dev *rciep = data; > > > > > > if ((dev is RCEC) && rcec_assoc_rciep(dev, rciep)) > > > rciep->rcec = dev; > > > } > > > > > > pci_setup_device > > > ... > > I just noticed your use of pci_setup_device(). Are you suggesting > moving the call to pci_rcec_init() out of pci_init_capabilities() and > move it into pci_setup_device()? If so, would pci_rcec_exit() still > remain in pci_release_capabilities()? > > I'm just wondering if it could just remain in pci_init_capabilities(). Yeah, I didn't mean in pci_setup_device() specifically, just
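The rcec_assoc_rciep() pseudocode quoted above can be written out as plain C. This is a minimal sketch only: `struct rcec_info` and `struct rciep_info` are hypothetical, simplified stand-ins for the fields the thread refers to (the association bitmap and the next/last bus numbers), not the layout of struct pci_dev.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of an RCEC; not struct pci_dev. */
struct rcec_info {
    uint8_t  bus;       /* bus the RCEC itself sits on */
    uint32_t bitmap;    /* one bit per device number on that bus */
    uint8_t  next_bus;  /* first associated bus number */
    uint8_t  last_bus;  /* last associated bus number */
};

/* Hypothetical, simplified view of an RCiEP. */
struct rciep_info {
    uint8_t bus;
    uint8_t devfn;      /* device number in bits 7:3, function in bits 2:0 */
};

/* True if the RCiEP is associated with the RCEC. */
bool rcec_assoc_rciep(const struct rcec_info *rcec,
                      const struct rciep_info *rciep)
{
    unsigned int devn = rciep->devfn >> 3;

    /* Same bus: the association bitmap decides, one bit per device number. */
    if (rcec->bus == rciep->bus)
        return (rcec->bitmap >> devn) & 1U;

    /* Other bus: associated only if it falls in the next/last bus range. */
    return rciep->bus >= rcec->next_bus && rciep->bus <= rcec->last_bus;
}
```

With this shape, both link_rcec() and find_rcec() from the pseudocode reduce to a type check followed by a single call to the helper, whichever of the two devices is enumerated first.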
[PATCH v2 2/2] perf metric: Fix some memory leaks - part 2
metric_event_delete() failed to free expr->metric_events, and expr should also be freed when the metric_refs allocation fails. Cc: Kajol Jain Cc: John Garry Cc: Ian Rogers Fixes: 4ea2896715e67 ("perf metric: Collect referenced metrics in struct metric_expr") Signed-off-by: Namhyung Kim --- tools/perf/util/metricgroup.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c index af664d6218d6..b28c09447c10 100644 --- a/tools/perf/util/metricgroup.c +++ b/tools/perf/util/metricgroup.c @@ -85,6 +85,7 @@ static void metric_event_delete(struct rblist *rblist __maybe_unused, list_for_each_entry_safe(expr, tmp, >head, nd) { free(expr->metric_refs); + free(expr->metric_events); free(expr); } @@ -316,6 +317,7 @@ static int metricgroup__setup_events(struct list_head *groups, if (!metric_refs) { ret = -ENOMEM; free(metric_events); + free(expr); break; } -- 2.28.0.526.ge36021eeef-goog
[PATCH v2 1/2] perf metric: Fix some memory leaks
I found some memory leaks while reading the metric code. Some are real and others only occur in the error path. When it failed during metric or event parsing, it should release all resources properly. Cc: Kajol Jain Cc: John Garry Cc: Ian Rogers Fixes: b18f3e365019d ("perf stat: Support JSON metrics in perf stat") Signed-off-by: Namhyung Kim --- tools/perf/util/metricgroup.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c index 8831b964288f..af664d6218d6 100644 --- a/tools/perf/util/metricgroup.c +++ b/tools/perf/util/metricgroup.c @@ -530,6 +530,9 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter, continue; strlist__add(me->metrics, s); } + + if (!raw) + free(s); } free(omg); } @@ -1040,7 +1043,7 @@ static int parse_groups(struct evlist *perf_evlist, const char *str, ret = metricgroup__add_metric_list(str, metric_no_group, _events, _list, map); if (ret) - return ret; + goto out; pr_debug("adding %s\n", extra_events.buf); bzero(_error, sizeof(parse_error)); ret = __parse_events(perf_evlist, extra_events.buf, _error, fake_pmu); @@ -1048,11 +1051,11 @@ static int parse_groups(struct evlist *perf_evlist, const char *str, parse_events_print_error(_error, extra_events.buf); goto out; } - strbuf_release(_events); ret = metricgroup__setup_events(_list, metric_no_merge, perf_evlist, metric_events); out: metricgroup__free_metrics(_list); + strbuf_release(_events); return ret; } -- 2.28.0.526.ge36021eeef-goog
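The parse_groups() change above moves the release of extra_events under the `out:` label so that every exit path goes through it. A minimal user-space sketch of that single-exit cleanup pattern follows. It is not perf code: `parse_demo()`, `demo_alloc()`, `demo_free()` and the allocation counter are hypothetical helpers that exist only to make a leak observable.

```c
#include <stdlib.h>

static int live_allocs;  /* crude leak counter for the demo */

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

int demo_live_allocs(void)
{
    return live_allocs;
}

/*
 * fail_at: 0 = succeed, 1 = fail after the first allocation,
 *          2 = fail after the second allocation.
 */
int parse_demo(int fail_at)
{
    char *events = NULL, *metrics = NULL;
    int ret = 0;

    events = demo_alloc(64);
    if (!events || fail_at == 1) {
        ret = -1;
        goto out;  /* not "return ret": that would skip the cleanup */
    }

    metrics = demo_alloc(64);
    if (!metrics || fail_at == 2) {
        ret = -1;
        goto out;
    }

out:
    /* Every path, success or failure, releases both resources here. */
    demo_free(metrics);
    demo_free(events);
    return ret;
}
```

An early `return ret;` in place of the first `goto out;` would reproduce the kind of leak the patch fixes: the function would exit with `events` still allocated.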
linux-next: Signed-off-by missing for commit in the printk tree
Hi all, Commit 4c31ead75f41 ("printk: ringbuffer: support dataless records") is missing a Signed-off-by from its committer. -- Cheers, Stephen Rothwell
[PATCH] tools: fix incorrect setting of CC_NO_CLANG
CC_NO_CLANG should be set according to the value of CC after it has been overridden. I have linked /usr/bin/cc to /usr/bin/clang and I built perf with a gcc cross-compiler: $ ARCH=arm64 CROSS_COMPILE=aarch64-calvin-linux-gnu- make -C \ ../linux/tools/perf/ O=$(pwd) It worked well. But when I tried to rebuild that with FIXDEP=1: $ ARCH=arm64 CROSS_COMPILE=aarch64-calvin-linux-gnu- make -C \ ../linux/tools/perf/ O=$(pwd) FIXDEP=1 Every .o file was rebuilt since EXTRA_WARNINGS was changed due to the wrong value of CC_NO_CLANG. Things worked in the first build because the submake of Makefile.perf inherited CC from the first make and CC_NO_CLANG was rectified in the submake. Signed-off-by: Calvin Zhang --- tools/scripts/Makefile.include | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include index a7974638561c..dc887669828b 100644 --- a/tools/scripts/Makefile.include +++ b/tools/scripts/Makefile.include @@ -39,8 +39,6 @@ EXTRA_WARNINGS += -Wundef EXTRA_WARNINGS += -Wwrite-strings EXTRA_WARNINGS += -Wformat -CC_NO_CLANG := $(shell $(CC) -dM -E -x c /dev/null | grep -Fq "__clang__"; echo $$?) - # Makefiles suck: This macro sets a default value of $(2) for the # variable named by $(1), unless the variable has been set by # environment or command line. This is necessary for CC and AR @@ -59,6 +57,8 @@ $(call allow-override,LD,$(CROSS_COMPILE)ld) $(call allow-override,CXX,$(CROSS_COMPILE)g++) $(call allow-override,STRIP,$(CROSS_COMPILE)strip) +CC_NO_CLANG := $(shell $(CC) -dM -E -x c /dev/null | grep -Fq "__clang__"; echo $$?) + ifeq ($(CC_NO_CLANG), 1) EXTRA_WARNINGS += -Wstrict-aliasing=3 endif -- 2.18.4
[Linux-kernel-mentees] [PATCH] Fix uninit-value in hci_chan_lookup_handle
When the amount of data stored in the location corresponding to iov_iter *from is less than 4, some data seems to go uninitialized. Updating this condition accordingly, makes sense both intuitively and logically as well, since the other check for extreme condition done is if len > HCI_MAX_FRAME_SIZE, which is HCI_MAX_ACL_SIZE (which is 1024) + 4; which itself gives some idea about what must be the ideal minimum size. Reported-and-tested-by: syzbot+4c14a8f574461e1c3...@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam --- If there is some explicit reason why len < 4 doesn't work, and only len < 2 works, please do let me know. The commit message that introduced the initial change (512b2268156a4e15ebf897f9a883bdee153a54b7) wasn't exactly very helpful in this respect, and I couldn't find a whole lot of discussion regarding this either. drivers/bluetooth/hci_vhci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/bluetooth/hci_vhci.c b/drivers/bluetooth/hci_vhci.c index 8ab26dec5f6e..0c49821d7b98 100644 --- a/drivers/bluetooth/hci_vhci.c +++ b/drivers/bluetooth/hci_vhci.c @@ -159,7 +159,7 @@ static inline ssize_t vhci_get_user(struct vhci_data *data, __u8 pkt_type, opcode; int ret; - if (len < 2 || len > HCI_MAX_FRAME_SIZE) + if (len < 4 || len > HCI_MAX_FRAME_SIZE) return -EINVAL; skb = bt_skb_alloc(len, GFP_KERNEL); -- 2.25.1
Re: [PATCH] perf metric: Fix some memory leaks
Hi Arnaldo, On Sat, Sep 5, 2020 at 1:28 AM Arnaldo Carvalho de Melo wrote: > Humm, I assume all those fixes were for csets in a single Linux version, > right? Otherwise I think it'd be better to have a fix per Fixes tag, so > that they would go to the kernel sources where those bugs were fixed. $ git name-rev --tags 9afe5658a6fa8 4ea2896715e67 71b0acce78d12 b18f3e365019d 9afe5658a6fa8 tags/v5.9-rc1~66^2~135 4ea2896715e67 tags/v5.9-rc1~66^2~55 71b0acce78d12 tags/v4.15-rc1~160^2~38^2~36 b18f3e365019d tags/v4.15-rc1~160^2~38^2~37 I'll split it to two - one for v4.15 and another for v5.9. Thanks Namhyung
Re: [PATCH net v2] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices
On Fri, Sep 4, 2020 at 6:28 PM Xie He wrote: > > The HDLC device is not actually prepending any header when it is used > with this driver. When the PVC device has prepended its header and > handed over the skb to the HDLC device, the HDLC device just hands it > over to the hardware driver for transmission without prepending any > header. > > If we grep "header_ops" and "skb_push" in "hdlc.c" and "hdlc_fr.c", we > can see there is no "header_ops" implemented in these two files and > all "skb_push" happen in the PVC device in hdlc_fr.c. I want to provide a little more information about the flow after an HDLC device's ndo_start_xmit is called. An HDLC hardware driver's ndo_start_xmit is required to point to hdlc_start_xmit in hdlc.c. When a HDLC device receives a call to its ndo_start_xmit, hdlc_start_xmit will check if the protocol driver has provided a xmit function. If it has provided this function, hdlc_start_xmit will call it to start transmission. If it has not, hdlc_start_xmit will directly call the hardware driver's function to start transmission. This driver (hdlc_fr) has not provided a xmit function in its hdlc_proto struct, so hdlc_start_xmit will directly call the hardware driver's function to transmit. So no header will be prepended after ndo_start_xmit is called. There would not be any header prepended before ndo_start_xmit is called, either, because there is no header_ops implemented in either hdlc.c or hdlc_fr.c. On Fri, Sep 4, 2020 at 6:28 PM Xie He wrote: > > Thank you for your email, Jakub! > > On Fri, Sep 4, 2020 at 3:14 PM Jakub Kicinski wrote: > > > > Since this is a tunnel protocol on top of HDLC interfaces, and > > hdlc_setup_dev() sets dev->hard_header_len = 16; should we actually > > set the needed_headroom to 10 + 16 = 26? I'm not clear on where/if > > hdlc devices actually prepend 16 bytes of header, though. > > The HDLC device is not actually prepending any header when it is used > with this driver. 
When the PVC device has prepended its header and > handed over the skb to the HDLC device, the HDLC device just hands it > over to the hardware driver for transmission without prepending any > header. > > If we grep "header_ops" and "skb_push" in "hdlc.c" and "hdlc_fr.c", we > can see there is no "header_ops" implemented in these two files and > all "skb_push" happen in the PVC device in hdlc_fr.c. > > For this reason, I have previously submitted a patch to change the > value of hard_header_len of the HDLC device from 16 to 0, because it > is not actually used. > > See: > 2b7bcd967a0f (drivers/net/wan/hdlc: Change the default of hard_header_len to > 0) > > > > diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c > > > index 9acad651ea1f..12b35404cd8e 100644 > > > --- a/drivers/net/wan/hdlc_fr.c > > > +++ b/drivers/net/wan/hdlc_fr.c > > > @@ -1041,7 +1041,7 @@ static void pvc_setup(struct net_device *dev) > > > { > > > dev->type = ARPHRD_DLCI; > > > dev->flags = IFF_POINTOPOINT; > > > - dev->hard_header_len = 10; > > > + dev->hard_header_len = 0; > > > > Is there a need to set this to 0? Will it not be zero after allocation? > > Oh. I understand your point. Theoretically we don't need to set it to > 0 because it already has the default value of 0. I'm setting it to 0 > only because I want to tell future developers that this value is > intentionally set to 0, and it is not carelessly missed out.
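The transmit flow described above (hdlc_start_xmit() calls the protocol's xmit function if one is provided, otherwise the hardware driver's) can be modelled with two function pointers. This is a minimal user-space sketch, not the code in hdlc.c: `struct frame`, `struct proto_ops`, `struct hdlc_dev`, `model_start_xmit()` and the two callbacks are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Stand-in for an skb: only counts headers added after start_xmit. */
struct frame {
    int headers_added;
};

typedef int (*xmit_fn)(struct frame *f);

/* Stand-in for the protocol ops: xmit may be NULL (as described for hdlc_fr). */
struct proto_ops {
    xmit_fn xmit;
};

struct hdlc_dev {
    const struct proto_ops *proto;
    xmit_fn hw_xmit;  /* hardware driver's transmit function */
};

/* Models the dispatch: protocol xmit if present, else straight to hardware. */
int model_start_xmit(struct hdlc_dev *dev, struct frame *f)
{
    if (dev->proto && dev->proto->xmit)
        return dev->proto->xmit(f);

    return dev->hw_xmit(f);
}

/* Hardware transmit: reports how many headers were added on the way. */
int hw_xmit_plain(struct frame *f)
{
    return f->headers_added;
}

/* A protocol that does prepend something before handing off to hardware. */
int proto_xmit_with_header(struct frame *f)
{
    f->headers_added++;
    return hw_xmit_plain(f);
}
```

In this model, a protocol with no xmit callback reaches the hardware function with `headers_added` still 0, which mirrors the argument that nothing is prepended after ndo_start_xmit is called for the PVC case.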
Re: [PATCH] clk: imx: fix i.MX7D peripheral clk mux flags
On Fri, Aug 28, 2020 at 03:18:50PM +0800, peng@nxp.com wrote: > From: Peng Fan > > According to RM, Page 574, Chapter 5.2.6.4.3 Peripheral clock slice, > "IP clock slices must be stopped to change the clock source.". > > So we must have CLK_SET_PARENT_GATE flag to avoid glitch. > > Signed-off-by: Peng Fan Applied, thanks.
Re: [PATCH v2 0/2] ARM: dts: add Tolino Shine 2 HD
On Wed, Aug 26, 2020 at 10:42:49PM +0200, Andreas Kemnade wrote: > This adds a device tree for the Tolino Shine 2 HD Ebook reader. > > It is equipped with an i.MX6SL SoC. Except for backlight (via an EC) and > the EPD, drivers are available and therefore things are defined in the > dts. > > Andreas Kemnade (2): > dt-bindings: arm: fsl: add compatible string for Tolino Shine 2 HD > ARM: dts: imx: add devicetree for Tolino Shine 2 HD Applied both, thanks.
KASAN: use-after-free Read in delete_partition
Hello, syzbot found the following issue on: HEAD commit:f75aef39 Linux 5.9-rc3 git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=130c72f590 kernel config: https://syzkaller.appspot.com/x/.config?x=3c5f6ce8d5b68299 dashboard link: https://syzkaller.appspot.com/bug?extid=b8639c8dcb5ec4483d4f compiler: gcc (GCC) 10.1.0-syz 20200507 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15c43c7990 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=173dfa1e90 The issue was bisected to: commit cddae808aeb77e5c29d22a8e0dfbdaed413f9e04 Author: Christoph Hellwig Date: Tue Apr 14 07:28:54 2020 + block: pass a hd_struct to delete_partition bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1469647690 final oops: https://syzkaller.appspot.com/x/report.txt?x=1669647690 console output: https://syzkaller.appspot.com/x/log.txt?x=1269647690 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+b8639c8dcb5ec4483...@syzkaller.appspotmail.com Fixes: cddae808aeb7 ("block: pass a hd_struct to delete_partition") == BUG: KASAN: use-after-free in kobject_put+0x220/0x270 lib/kobject.c:748 Read of size 1 at addr 8880978c7a3c by task syz-executor581/7048 CPU: 1 PID: 7048 Comm: syz-executor581 Not tainted 5.9.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 kobject_put+0x220/0x270 lib/kobject.c:748 delete_partition+0x134/0x220 block/partitions/core.c:324 bdev_del_partition+0x18b/0x1d0 block/partitions/core.c:549 blkpg_do_ioctl+0x2d6/0x330 block/ioctl.c:33 blkpg_ioctl block/ioctl.c:69 [inline] blkdev_ioctl+0x58a/0x700 block/ioctl.c:589 block_ioctl+0xf9/0x140 fs/block_dev.c:1871 vfs_ioctl 
fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x446959 Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db 06 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7fcab8868d98 EFLAGS: 0246 ORIG_RAX: 0010 RAX: ffda RBX: 006dbc38 RCX: 00446959 RDX: 2000 RSI: 1269 RDI: 0005 RBP: 006dbc30 R08: 7fcab8869700 R09: R10: 7fcab8869700 R11: 0246 R12: 006dbc3c R13: 00a9 R14: 00b747111e42e3ec R15: 02ba2b7a04b8ac00 Allocated by task 7050: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461 kmem_cache_alloc_trace+0x174/0x2c0 mm/slab.c:3550 kmalloc include/linux/slab.h:554 [inline] kzalloc include/linux/slab.h:666 [inline] kobject_create lib/kobject.c:783 [inline] kobject_create_and_add+0x42/0xb0 lib/kobject.c:809 add_partition+0x989/0xd80 block/partitions/core.c:443 bdev_add_partition+0xb6/0x130 block/partitions/core.c:518 blkpg_do_ioctl+0x2be/0x330 block/ioctl.c:52 blkpg_ioctl block/ioctl.c:69 [inline] blkdev_ioctl+0x58a/0x700 block/ioctl.c:589 block_ioctl+0xf9/0x140 fs/block_dev.c:1871 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 7049: kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kfree+0x10e/0x2b0 mm/slab.c:3756 kobject_cleanup lib/kobject.c:704 [inline] kobject_release 
lib/kobject.c:735 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x171/0x270 lib/kobject.c:752 delete_partition+0x134/0x220 block/partitions/core.c:324 bdev_del_partition+0x18b/0x1d0 block/partitions/core.c:549 blkpg_do_ioctl+0x2d6/0x330 block/ioctl.c:33 blkpg_ioctl block/ioctl.c:69 [inline] blkdev_ioctl+0x58a/0x700 block/ioctl.c:589 block_ioctl+0xf9/0x140 fs/block_dev.c:1871 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy
Re: [PATCH] clk: imx: fix composite peripheral flags
On Wed, Aug 26, 2020 at 03:14:07PM +0800, peng@nxp.com wrote: > From: Peng Fan > > According to RM, for peripheral clock slice, > "IP clock slices must be stopped to change the clock source.". > > So we must have CLK_SET_PARENT_GATE flag to avoid glitch. > > Signed-off-by: Peng Fan Applied, thanks.
[PATCH V2 1/3] efi: Support for MOK variable config table
Because of system-specific EFI firmware limitations, EFI volatile variables may not be capable of holding the required contents of the Machine Owner Key (MOK) certificate store when the certificate list grows above some size. Therefore, an EFI boot loader may pass the MOK certs via an EFI configuration table created specifically for this purpose to avoid this firmware limitation. An EFI configuration table is a much more primitive mechanism compared to EFI variables and is well suited for one-way passage of static information from a pre-OS environment to the kernel. This patch adds initial kernel support to recognize, parse, and validate the EFI MOK configuration table, where named entries contain the same data that would otherwise be provided in similarly named EFI variables. Additionally, this patch creates a sysfs binary file for each EFI MOK configuration table entry found. These files are read-only to root and are provided for use by user space utilities such as mokutil. A subsequent patch will load MOK certs into the trusted platform key ring using this infrastructure. 
Signed-off-by: Lenny Szubowicz --- arch/x86/kernel/setup.c | 1 + arch/x86/platform/efi/efi.c | 3 + drivers/firmware/efi/Makefile | 1 + drivers/firmware/efi/arm-init.c | 1 + drivers/firmware/efi/efi.c | 6 + drivers/firmware/efi/mokvar-table.c | 360 include/linux/efi.h | 34 +++ 7 files changed, 406 insertions(+) create mode 100644 drivers/firmware/efi/mokvar-table.c diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 3511736fbc74..d41be0df72f8 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1077,6 +1077,7 @@ void __init setup_arch(char **cmdline_p) efi_fake_memmap(); efi_find_mirror(); efi_esrt_init(); + efi_mokvar_table_init(); /* * The EFI specification says that boot service code won't be diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c index d37ebe6e70d7..8a26e705cb06 100644 --- a/arch/x86/platform/efi/efi.c +++ b/arch/x86/platform/efi/efi.c @@ -90,6 +90,9 @@ static const unsigned long * const efi_tables[] = { _log, _final_log, _rng_seed, +#ifdef CONFIG_LOAD_UEFI_KEYS + _table, +#endif }; u64 efi_setup; /* efi setup_data physical address */ diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile index 7a216984552b..03964e2d27c5 100644 --- a/drivers/firmware/efi/Makefile +++ b/drivers/firmware/efi/Makefile @@ -28,6 +28,7 @@ obj-$(CONFIG_EFI_DEV_PATH_PARSER) += dev-path-parser.o obj-$(CONFIG_APPLE_PROPERTIES) += apple-properties.o obj-$(CONFIG_EFI_RCI2_TABLE) += rci2-table.o obj-$(CONFIG_EFI_EMBEDDED_FIRMWARE)+= embedded-firmware.o +obj-$(CONFIG_LOAD_UEFI_KEYS) += mokvar-table.o fake_map-y += fake_mem.o fake_map-$(CONFIG_X86) += x86_fake_mem.o diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c index 71c445d20258..f55a92ff12c0 100644 --- a/drivers/firmware/efi/arm-init.c +++ b/drivers/firmware/efi/arm-init.c @@ -236,6 +236,7 @@ void __init efi_init(void) reserve_regions(); efi_esrt_init(); + efi_mokvar_table_init(); memblock_reserve(data.phys_map & PAGE_MASK, 
PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 3aa07c3b5136..3d4daf215e19 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -43,6 +43,9 @@ struct efi __read_mostly efi = { .esrt = EFI_INVALID_TABLE_ADDR, .tpm_log= EFI_INVALID_TABLE_ADDR, .tpm_final_log = EFI_INVALID_TABLE_ADDR, +#ifdef CONFIG_LOAD_UEFI_KEYS + .mokvar_table = EFI_INVALID_TABLE_ADDR, +#endif }; EXPORT_SYMBOL(efi); @@ -518,6 +521,9 @@ static const efi_config_table_type_t common_tables[] __initconst = { {EFI_RT_PROPERTIES_TABLE_GUID, _prop, "RTPROP"}, #ifdef CONFIG_EFI_RCI2_TABLE {DELLEMC_EFI_RCI2_TABLE_GUID, _table_phys }, +#endif +#ifdef CONFIG_LOAD_UEFI_KEYS + {LINUX_EFI_MOK_VARIABLE_TABLE_GUID, _table, "MOKvar"}, #endif {}, }; diff --git a/drivers/firmware/efi/mokvar-table.c b/drivers/firmware/efi/mokvar-table.c new file mode 100644 index ..f12f1710f5d9 --- /dev/null +++ b/drivers/firmware/efi/mokvar-table.c @@ -0,0 +1,360 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * mokvar-table.c + * + * Copyright (c) 2020 Red Hat + * Author: Lenny Szubowicz + * + * This module contains the kernel support for the Linux EFI Machine + * Owner Key (MOK) variable configuration table, which is identified by + * the LINUX_EFI_MOK_VARIABLE_TABLE_GUID. + * + * This
[PATCH V2 3/3] integrity: Load certs from the EFI MOK config table
Because of system-specific EFI firmware limitations, EFI volatile variables may not be capable of holding the required contents of the Machine Owner Key (MOK) certificate store when the certificate list grows above some size. Therefore, an EFI boot loader may pass the MOK certs via an EFI configuration table created specifically for this purpose to avoid this firmware limitation. An EFI configuration table is a much more primitive mechanism compared to EFI variables and is well suited for one-way passage of static information from a pre-OS environment to the kernel. This patch adds the support to load certs from the MokListRT entry in the MOK variable configuration table, if it's present. The pre-existing support to load certs from the MokListRT EFI variable remains and is used if the EFI MOK configuration table isn't present or can't be successfully used. Signed-off-by: Lenny Szubowicz --- security/integrity/platform_certs/load_uefi.c | 22 +++ 1 file changed, 22 insertions(+) diff --git a/security/integrity/platform_certs/load_uefi.c b/security/integrity/platform_certs/load_uefi.c index c1c622b4dc78..ee4b4c666854 100644 --- a/security/integrity/platform_certs/load_uefi.c +++ b/security/integrity/platform_certs/load_uefi.c @@ -71,16 +71,38 @@ static __init void *get_cert_list(efi_char16_t *name, efi_guid_t *guid, * Load the certs contained in the UEFI MokListRT database into the * platform trusted keyring. * + * This routine checks the EFI MOK config table first. If and only if + * that fails, this routine uses the MokListRT ordinary UEFI variable. + * * Return: Status */ static int __init load_moklist_certs(void) { + struct efi_mokvar_table_entry *mokvar_entry; efi_guid_t mok_var = EFI_SHIM_LOCK_GUID; void *mok; unsigned long moksize; efi_status_t status; int rc; + /* First try to load certs from the EFI MOKvar config table. +* It's not an error if the MOKvar config table doesn't exist +* or the MokListRT entry is not found in it. 
+*/ + mokvar_entry = efi_mokvar_entry_find("MokListRT"); + if (mokvar_entry) { + rc = parse_efi_signature_list("UEFI:MokListRT (MOKvar table)", + mokvar_entry->data, + mokvar_entry->data_size, + get_handler_for_db); + /* All done if that worked. */ + if (!rc) + return rc; + + pr_err("Couldn't parse MokListRT signatures from EFI MOKvar config table: %d\n", + rc); + } + /* Get MokListRT. It might not exist, so it isn't an error * if we can't get it. */ -- 2.27.0
[PATCH V2 2/3] integrity: Move import of MokListRT certs to a separate routine
Move the loading of certs from the UEFI MokListRT into a separate routine to facilitate additional MokList functionality. There is no visible functional change as a result of this patch. Although the UEFI dbx certs are now loaded before the MokList certs, they are loaded onto different key rings. So the order of the keys on their respective key rings is the same. Signed-off-by: Lenny Szubowicz --- security/integrity/platform_certs/load_uefi.c | 63 +-- 1 file changed, 44 insertions(+), 19 deletions(-) diff --git a/security/integrity/platform_certs/load_uefi.c b/security/integrity/platform_certs/load_uefi.c index 253fb9a7fc98..c1c622b4dc78 100644 --- a/security/integrity/platform_certs/load_uefi.c +++ b/security/integrity/platform_certs/load_uefi.c @@ -66,6 +66,43 @@ static __init void *get_cert_list(efi_char16_t *name, efi_guid_t *guid, } /* + * load_moklist_certs() - Load MokList certs + * + * Load the certs contained in the UEFI MokListRT database into the + * platform trusted keyring. + * + * Return: Status + */ +static int __init load_moklist_certs(void) +{ + efi_guid_t mok_var = EFI_SHIM_LOCK_GUID; + void *mok; + unsigned long moksize; + efi_status_t status; + int rc; + + /* Get MokListRT. It might not exist, so it isn't an error +* if we can't get it. +*/ + mok = get_cert_list(L"MokListRT", _var, , ); + if (mok) { + rc = parse_efi_signature_list("UEFI:MokListRT", + mok, moksize, get_handler_for_db); + kfree(mok); + if (rc) + pr_err("Couldn't parse MokListRT signatures: %d\n", rc); + return rc; + } + if (status == EFI_NOT_FOUND) + pr_debug("MokListRT variable wasn't found\n"); + else + pr_info("Couldn't get UEFI MokListRT\n"); + return 0; +} + +/* + * load_uefi_certs() - Load certs from UEFI sources + * * Load the certs contained in the UEFI databases into the platform trusted * keyring and the UEFI blacklisted X.509 cert SHA256 hashes into the blacklist * keyring. 
@@ -73,17 +110,16 @@ static __init void *get_cert_list(efi_char16_t *name, efi_guid_t *guid, static int __init load_uefi_certs(void) { efi_guid_t secure_var = EFI_IMAGE_SECURITY_DATABASE_GUID; - efi_guid_t mok_var = EFI_SHIM_LOCK_GUID; - void *db = NULL, *dbx = NULL, *mok = NULL; - unsigned long dbsize = 0, dbxsize = 0, moksize = 0; + void *db = NULL, *dbx = NULL; + unsigned long dbsize = 0, dbxsize = 0; efi_status_t status; int rc = 0; if (!efi_rt_services_supported(EFI_RT_SUPPORTED_GET_VARIABLE)) return false; - /* Get db, MokListRT, and dbx. They might not exist, so it isn't -* an error if we can't get them. + /* Get db and dbx. They might not exist, so it isn't an error +* if we can't get them. */ if (!uefi_check_ignore_db()) { db = get_cert_list(L"db", _var, , ); @@ -102,20 +138,6 @@ static int __init load_uefi_certs(void) } } - mok = get_cert_list(L"MokListRT", _var, , ); - if (!mok) { - if (status == EFI_NOT_FOUND) - pr_debug("MokListRT variable wasn't found\n"); - else - pr_info("Couldn't get UEFI MokListRT\n"); - } else { - rc = parse_efi_signature_list("UEFI:MokListRT", - mok, moksize, get_handler_for_db); - if (rc) - pr_err("Couldn't parse MokListRT signatures: %d\n", rc); - kfree(mok); - } - dbx = get_cert_list(L"dbx", _var, , ); if (!dbx) { if (status == EFI_NOT_FOUND) @@ -131,6 +153,9 @@ static int __init load_uefi_certs(void) kfree(dbx); } + /* Load the MokListRT certs */ + rc = load_moklist_certs(); + return rc; } late_initcall(load_uefi_certs); -- 2.27.0
[PATCH V2 0/3] integrity: Load certs from EFI MOK config table
Because of system-specific EFI firmware limitations, EFI volatile variables may not be capable of holding the required contents of the Machine Owner Key (MOK) certificate store when the certificate list grows above some size. Therefore, an EFI boot loader may pass the MOK certs via an EFI configuration table created specifically for this purpose to avoid this firmware limitation. An EFI configuration table is a simpler and more robust mechanism compared to EFI variables and is well suited for one-way passage of static information from a pre-OS environment to the kernel. Entries in the MOK variable configuration table are named key/value pairs. Therefore the shim boot loader can create a MokListRT named entry in the MOK configuration table that contains exactly the same data as the MokListRT UEFI variable does or would otherwise contain. As such, the kernel can load certs from the data in the MokListRT configuration table entry in the same way that it loads certs from the data returned by the EFI GetVariable() runtime call for the MokListRT variable. This patch set does not remove the support for loading certs from the EFI MOK variables into the platform key ring. However, if both the EFI MOK configuration table and corresponding EFI MOK variables are present, the MOK table is used as the source of MOK certs. The contents of the individual named MOK config table entries are made available to user space as individual sysfs binary files, which are read-only to root, under: /sys/firmware/efi/mok-variables/ This enables an updated mokutil to provide support for: mokutil --list-enrolled such that it can provide accurate information regardless of whether the MOK configuration table or MOK EFI variables were the source for certs. Note that all modifications of MOK related state are still initiated by mokutil via EFI variables. 
V2: Incorporate feedback from V1 Patch 01: efi: Support for MOK variable config table - Minor update to change log; no code changes Patch 02: integrity: Move import of MokListRT certs to a separate routine - Clean up code flow in code moved to load_moklist_certs() - Remove some unnecessary initialization of variables Patch 03: integrity: Load certs from the EFI MOK config table - Update required due to changes in patch 02. - Remove unnecessary init of mokvar_entry in load_moklist_certs() V1: https://lore.kernel.org/lkml/20200826034455.28707-1-lszub...@redhat.com/ Lenny Szubowicz (3): efi: Support for MOK variable config table integrity: Move import of MokListRT certs to a separate routine integrity: Load certs from the EFI MOK config table arch/x86/kernel/setup.c | 1 + arch/x86/platform/efi/efi.c | 3 + drivers/firmware/efi/Makefile | 1 + drivers/firmware/efi/arm-init.c | 1 + drivers/firmware/efi/efi.c| 6 + drivers/firmware/efi/mokvar-table.c | 360 ++ include/linux/efi.h | 34 ++ security/integrity/platform_certs/load_uefi.c | 85 - 8 files changed, 472 insertions(+), 19 deletions(-) create mode 100644 drivers/firmware/efi/mokvar-table.c -- 2.27.0
Re: [PATCH 0/3] integrity: Load certs from EFI MOK config table
On 8/26/20 7:55 AM, Mimi Zohar wrote: Hi Lenny, On Tue, 2020-08-25 at 23:44 -0400, Lenny Szubowicz wrote: Because of system-specific EFI firmware limitations, EFI volatile variables may not be capable of holding the required contents of the Machine Owner Key (MOK) certificate store. Therefore, an EFI boot loader may pass the MOK certs via an EFI configuration table created specifically for this purpose to avoid this firmware limitation. An EFI configuration table is a simpler and more robust mechanism compared to EFI variables and is well suited for one-way passage of static information from a pre-OS environment to the kernel. This patch set does not remove the support for loading certs from the EFI MOK variables into the platform key ring. However, if both the EFI MOK config table and corresponding EFI MOK variables are present, the MOK table is used as the source of MOK certs. The contents of the individual named MOK config table entries are made available to user space via read-only sysfs binary files under: /sys/firmware/efi/mok-variables/ Please include a security section in this cover letter with a comparison of the MoK variables and the EFI configuration table security (eg. same mechanism?). Has mokutil been updated? If so, please provide a link. Mimi I've included some more information about the MOK config table entries in the V2 cover letter. [root@localhost ~]# ls -l /sys/firmware/efi/mok-variables total 0 -r. 1 root root 0 Sep 4 21:10 MokIgnoreDB -r. 1 root root 18184 Sep 4 21:10 MokListRT -r. 1 root root 76 Sep 4 21:10 MokListXRT -r. 1 root root 0 Sep 4 21:10 MokSBStateRT The roughly 18KB of data in /sys/firmware/efi/mok-variables/MokListRT is exactly the same data that is returned by an EFI GetVariable() call for MokListRT. Of course, that's on a system where the EFI firmware can handle a volatile variable with that much data. 
Therefore, load_moklist_certs() can pass the mokvar_entry data directly to parse_efi_signature_list() in the same way it does for the efi.get_variable() data that it obtains via get_cert_list(). Unfortunately, there is no updated mokutil available yet that uses the new sysfs entries. Also relevant is availability of an updated shim, which builds the EFI MOK variable configuration table. Of course, both of these should show up as upstream pull requests and also in Fedora rawhide at some point. Thank you for your review. -Lenny.
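The lookup order described in this thread (MOK config table first, EFI variable only as a fallback) can be sketched in user-space C. This is only a model under stated assumptions: `mok_entry`, `table_find()` and `pick_source()` are hypothetical illustration names, not the kernel's `efi_mokvar_entry_find()` or `get_cert_list()`, and "usable" is simplified here to "entry present with a non-zero size" rather than "parsed successfully".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical named key/value entry, modelled on the description above. */
struct mok_entry {
	const char *name;
	const unsigned char *data;
	size_t data_size;
};

enum mok_source { MOK_FROM_TABLE, MOK_FROM_VARIABLE, MOK_NONE };

/* Linear search of a table of named entries; returns NULL if absent. */
static const struct mok_entry *table_find(const struct mok_entry *table,
					  size_t count, const char *name)
{
	for (size_t i = 0; i < count; i++)
		if (strcmp(table[i].name, name) == 0)
			return &table[i];
	return NULL;
}

/*
 * Prefer the config table entry; fall back to the variable only when the
 * table has no usable MokListRT entry. A missing source is not an error.
 */
static enum mok_source pick_source(const struct mok_entry *table, size_t count,
				   const struct mok_entry *variable,
				   const struct mok_entry **out)
{
	const struct mok_entry *e =
		table ? table_find(table, count, "MokListRT") : NULL;

	if (e && e->data_size > 0) {
		*out = e;
		return MOK_FROM_TABLE;
	}
	if (variable) {
		*out = variable;
		return MOK_FROM_VARIABLE;
	}
	*out = NULL;
	return MOK_NONE;
}
```

Because both sources carry the same signature-list bytes, a caller of `pick_source()` could hand `out->data` and `out->data_size` to a single parser regardless of where the entry came from, which is the property the cover letter relies on.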
Re: [PATCH net v2] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices
Thank you for your email, Jakub! On Fri, Sep 4, 2020 at 3:14 PM Jakub Kicinski wrote: > > Since this is a tunnel protocol on top of HDLC interfaces, and > hdlc_setup_dev() sets dev->hard_header_len = 16; should we actually > set the needed_headroom to 10 + 16 = 26? I'm not clear on where/if > hdlc devices actually prepend 16 bytes of header, though. The HDLC device is not actually prepending any header when it is used with this driver. When the PVC device has prepended its header and handed over the skb to the HDLC device, the HDLC device just hands it over to the hardware driver for transmission without prepending any header. If we grep "header_ops" and "skb_push" in "hdlc.c" and "hdlc_fr.c", we can see there is no "header_ops" implemented in these two files and all "skb_push" happen in the PVC device in hdlc_fr.c. For this reason, I have previously submitted a patch to change the value of hard_header_len of the HDLC device from 16 to 0, because it is not actually used. See: 2b7bcd967a0f (drivers/net/wan/hdlc: Change the default of hard_header_len to 0) > > diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c > > index 9acad651ea1f..12b35404cd8e 100644 > > --- a/drivers/net/wan/hdlc_fr.c > > +++ b/drivers/net/wan/hdlc_fr.c > > @@ -1041,7 +1041,7 @@ static void pvc_setup(struct net_device *dev) > > { > > dev->type = ARPHRD_DLCI; > > dev->flags = IFF_POINTOPOINT; > > - dev->hard_header_len = 10; > > + dev->hard_header_len = 0; > > Is there a need to set this to 0? Will it not be zero after allocation? Oh. I understand your point. Theoretically we don't need to set it to 0 because it already has the default value of 0. I'm setting it to 0 only because I want to tell future developers that this value is intentionally set to 0, and it is not carelessly missed out.
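To illustrate the headroom argument, here is a minimal user-space sketch (not kernel code) of how prepending a header consumes the free space in front of the payload. The `toy_skb` structure and its helpers are hypothetical stand-ins for `struct sk_buff`, `skb_reserve()`, `skb_put()` and `skb_push()`; the 10-byte figure is the PVC header length discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 64
#define PVC_HEADER_LEN 10 /* header length discussed in this thread */

/* Hypothetical, simplified model of an skb: one flat buffer plus offsets. */
struct toy_skb {
	unsigned char buf[TOY_BUF_SIZE];
	size_t data; /* offset of the first valid byte */
	size_t len;  /* number of valid bytes starting at data */
};

/* Leave `headroom` free bytes in front of the (still empty) payload. */
static void toy_reserve(struct toy_skb *skb, size_t headroom)
{
	skb->data = headroom;
	skb->len = 0;
}

/* Append payload bytes; returns 0 on success, -1 if there is no room. */
static int toy_put(struct toy_skb *skb, const void *src, size_t n)
{
	if (skb->data + skb->len + n > TOY_BUF_SIZE)
		return -1;
	memcpy(skb->buf + skb->data + skb->len, src, n);
	skb->len += n;
	return 0;
}

/* Prepend n bytes; returns NULL instead of underflowing the buffer. */
static unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
	if (n > skb->data)
		return NULL;
	skb->data -= n;
	skb->len += n;
	return skb->buf + skb->data;
}

/* Free bytes still available in front of the data. */
static size_t toy_headroom(const struct toy_skb *skb)
{
	return skb->data;
}
```

In this model only the layer that actually calls `toy_push()` needs headroom. That mirrors the point made above: the PVC device is the only one pushing a header, so it is the one that has to advertise the 10 bytes, while the underlying HDLC device pushes nothing.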
Re: [PATCH] usb: typec: tcpm: Fix if vbus before cc, hard_reset_count not reset issue
Guenter Roeck wrote on Sat, Sep 5, 2020 at 3:41 AM: > > On 9/3/20 9:21 AM, ChiYuan Huang wrote: > > Guenter Roeck wrote on Thu, Sep 3, 2020 at 12:57 AM: > >> > >> On Wed, Sep 02, 2020 at 11:35:33PM +0800, cy_huang wrote: > >>> From: ChiYuan Huang > >>> > >>> Fix: If vbus event is before cc_event trigger, hard_reset_count > >>> won't be reset for some case. > >>> > >>> Signed-off-by: ChiYuan Huang > >>> --- > >>> Below's the flow. > >>> > >>> _tcpm_pd_vbus_off() -> run_state_machine to change state to SNK_UNATTACHED > >>> call tcpm_snk_detach() -> tcpm_snk_detach() -> tcpm_detach() > >>> tcpm_port_is_disconnected() will be called. > >>> But port->attached is still true and port->cc1=open and port->cc2=open > >>> > >>> It cause tcpm_port_is_disconnected return false, then hard_reset_count > >>> won't be reset. > >>> After that, tcpm_reset_port() is called. > >>> port->attached become false. > >>> > >>> After that, cc now trigger cc_change event, the hard_reset_count will be > >>> kept. > >>> Even tcpm_detach will be called, due to port->attached is false, > >>> tcpm_detach() > >>> will directly return. > >>> > >>> CC_EVENT will only trigger drp toggling again. > >>> --- > >>> drivers/usb/typec/tcpm/tcpm.c | 3 +-- > >>> 1 file changed, 1 insertion(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c > >>> index a48e3f90..5c73e1d 100644 > >>> --- a/drivers/usb/typec/tcpm/tcpm.c > >>> +++ b/drivers/usb/typec/tcpm/tcpm.c > >>> @@ -2797,8 +2797,7 @@ static void tcpm_detach(struct tcpm_port *port) > >>> port->tcpc->set_bist_data(port->tcpc, false); > >>> } > >>> > >>> - if (tcpm_port_is_disconnected(port)) > >>> - port->hard_reset_count = 0; > >>> + port->hard_reset_count = 0; > >>> > >> > >> Doesn't that mean that the state machine will never enter > >> error recovery ? > >> > > I think it doesn't affect the error recovery. > > All error recovery seems to check pd_capable flag. > > > > From my below case, it's A to C cable only. 
There is no USBPD contract > > will be established. > > > > This case occurred following by the below test condition > > Cable -> A to C (default Rp bind to vbus) connected to PC. > > 1. first time plugged in the cable with PC > > It will make HARD_RESET_COUNT to be equal 2 > > 2. And then plug out. At that time HARD_RESET_COUNT is still 2. > > 3. next time plugged in again. > > Due to hard_reset_count is still 2 , after wait_cap_timeout, the state > > eventually changed to SNK_READY. > > But during the state transition, no hard_reset be sent. > > > > Defined in the USBPD policy engine, typec transition to USBPD, all > > variables must be reset included hard_reset_count. > > So it expected SNK must send hard_reset again. > > > > The original code defined hard_reset_count must be reset only when > > tcpm_port_is_disconnected. > > > > It doesn't make sense that it only occurred in some scenario. > > If tcpm_detach is called, hard_reset count must be reset also. > > > > If a hard reset fails, the state machine may cycle through states > HARD_RESET_SEND, HARD_RESET_START, SRC_HARD_RESET_VBUS_OFF, > SRC_HARD_RESET_VBUS_ON back to SRC_UNATTACHED. In this state, > tcpm_src_detach() and with it tcpm_detach() is called. The hard > reset counter is incremented in HARD_RESET_SEND. If tcpm_detach() > resets the counter, the state machine will keep cycling through hard > resets without ever entering the error recovery state. I am not > entirely sure where the counter should be reset, but tcpm_detach() > seems to be the wrong place. The case you describe means that an error occurred locally. It is intended to re-run the state machine from Type-C to USBPD. From my understanding, resetting hard_reset_count in that case is reasonable. The normal state transition for the case you describe is HARD_RESET_SEND, HARD_RESET_START -> SRC_HARD_RESET_VBUS_OFF, SRC_HARD_RESET_VBUS_ON -> received VBUS_EVENT then go to SRC_STARTUP. 
> > Guenter > > >> Guenter > >> > >>> tcpm_reset_port(port); > >>> } > >>> -- > >>> 2.7.4 > >>> >
Re: [linux-next PATCH v4] drivers/virt/fsl_hypervisor: Fix error handling path
On 9/4/20 6:16 PM, Souptick Joarder wrote: Hi Andrew, On Wed, Sep 2, 2020 at 3:00 AM John Hubbard wrote: On 9/1/20 2:21 PM, Souptick Joarder wrote: First, when memory allocation for sg_list_unaligned failed, there is a bug of calling put_pages() as we haven't pinned any pages. Second, if get_user_pages_fast() failed we should unpin num_pinned pages. This will address both. As part of these changes, minor update in documentation. Fixes: 6db7199407ca ("drivers/virt: introduce Freescale hypervisor management driver") Signed-off-by: Souptick Joarder Reviewed-by: Dan Carpenter Reviewed-by: John Hubbard --- This looks good to me. Can you please take this patch through the mm tree ? Is there a problem with sending it through its normal tree? It would probably get better testing coverage there. thanks, -- John Hubbard NVIDIA
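The two error-path rules in the quoted commit message (release nothing when nothing was pinned, and release exactly the pages that were pinned after a partial pin) can be shown with a small user-space model. All names here (`toy_pin()`, `toy_unpin()`, `toy_transfer()`, `live_pins`) are hypothetical; the real driver uses the kernel's page pinning API, which this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of "pages" currently pinned and not yet released. */
static int live_pins;

/* Hypothetical pin: may pin fewer pages than requested; returns the count. */
static int toy_pin(int want, int limit)
{
	int got = want < limit ? want : limit;

	live_pins += got;
	return got;
}

/* Release exactly n previously pinned pages. */
static void toy_unpin(int n)
{
	live_pins -= n;
}

static int toy_transfer(int num_pages, int pin_limit, int fail_alloc)
{
	int num_pinned = 0;
	int ret = 0;
	void *sg_list = fail_alloc ? NULL : malloc(16);

	if (!sg_list)
		return -1; /* nothing pinned yet, so nothing to unpin */

	num_pinned = toy_pin(num_pages, pin_limit);
	if (num_pinned != num_pages) {
		ret = -2; /* partial pin: release only what we hold */
		goto out;
	}

	/* ... the real work would happen here ... */

out:
	toy_unpin(num_pinned); /* never more than was actually pinned */
	free(sg_list);
	return ret;
}
```

Tracking `num_pinned` separately from `num_pages` is what keeps the cleanup balanced on every path in this model: the counter `live_pins` returns to zero whether the call succeeds, pins partially, or fails before pinning.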
[PATCH v3 3/4] ima: limit secure boot feedback scope for appraise
Only prompt the unknown/invalid appraisal option if secureboot is enabled and if the current appraisal state is different from the original one. Signed-off-by: Bruno Meneguele --- Changelog: v3: - fix sb_state conditional (Mimi) v2: - update commit message (Mimi) - work with a temporary var instead of directly with ima_appraise (Mimi) security/integrity/ima/ima_appraise.c | 25 - 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c index 2193b51c2743..4f028f6e8f8d 100644 --- a/security/integrity/ima/ima_appraise.c +++ b/security/integrity/ima/ima_appraise.c @@ -19,22 +19,29 @@ static int __init default_appraise_setup(char *str) { #ifdef CONFIG_IMA_APPRAISE_BOOTPARAM - if (arch_ima_get_secureboot()) { - pr_info("Secure boot enabled: ignoring ima_appraise=%s boot parameter option", - str); - return 1; - } + bool sb_state = arch_ima_get_secureboot(); + int appraisal_state = ima_appraise; if (strncmp(str, "off", 3) == 0) - ima_appraise = 0; + appraisal_state = 0; else if (strncmp(str, "log", 3) == 0) - ima_appraise = IMA_APPRAISE_LOG; + appraisal_state = IMA_APPRAISE_LOG; else if (strncmp(str, "fix", 3) == 0) - ima_appraise = IMA_APPRAISE_FIX; + appraisal_state = IMA_APPRAISE_FIX; else if (strncmp(str, "enforce", 7) == 0) - ima_appraise = IMA_APPRAISE_ENFORCE; + appraisal_state = IMA_APPRAISE_ENFORCE; else pr_err("invalid \"%s\" appraise option", str); + + /* If appraisal state was changed, but secure boot is enabled, +* keep its default */ + if (sb_state) { + if (!(appraisal_state & IMA_APPRAISE_ENFORCE)) + pr_info("Secure boot enabled: ignoring ima_appraise=%s option", + str); + } else { + ima_appraise = appraisal_state; + } #endif return 1; } -- 2.26.2
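The decision logic of the v3 patch above can be modelled in plain C to make the secure boot cases easier to follow. This is a sketch under assumptions: the `APPRAISE_*` values are illustrative stand-ins for the kernel's `IMA_APPRAISE_*` flags, and `parse_appraise()` and `apply_appraise()` are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative stand-ins for the kernel's IMA_APPRAISE_* flags. */
#define APPRAISE_ENFORCE 0x01
#define APPRAISE_FIX     0x02
#define APPRAISE_LOG     0x04

/* Parse the option into a temporary state; unknown strings keep `current`. */
static int parse_appraise(const char *str, int current)
{
	if (strncmp(str, "off", 3) == 0)
		return 0;
	if (strncmp(str, "log", 3) == 0)
		return APPRAISE_LOG;
	if (strncmp(str, "fix", 3) == 0)
		return APPRAISE_FIX;
	if (strncmp(str, "enforce", 7) == 0)
		return APPRAISE_ENFORCE;
	return current;
}

/*
 * Returns the resulting appraisal state. *ignored is set when secure boot
 * caused a non-enforcing request to be dropped (the case the patch logs).
 */
static int apply_appraise(const char *str, int current, bool secure_boot,
			  bool *ignored)
{
	int requested = parse_appraise(str, current);

	*ignored = false;
	if (secure_boot) {
		if (!(requested & APPRAISE_ENFORCE))
			*ignored = true;
		return current; /* secure boot keeps the existing state */
	}
	return requested;
}
```

Working on a temporary value means the "ignoring" message is only needed when the request would actually have weakened appraisal. An unknown option under secure boot leaves the temporary equal to the current state, so in this model it is not reported as ignored.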
Re: [PATCH v2 3/4] ima: limit secure boot feedback scope for appraise
On Fri, Sep 04, 2020 at 05:07:08PM -0400, Mimi Zohar wrote: > Hi Bruno, > > > + bool sb_state = arch_ima_get_secureboot(); > > + int appraisal_state = ima_appraise; > > > > if (strncmp(str, "off", 3) == 0) > > - ima_appraise = 0; > > + appraisal_state = 0; > > else if (strncmp(str, "log", 3) == 0) > > - ima_appraise = IMA_APPRAISE_LOG; > > + appraisal_state = IMA_APPRAISE_LOG; > > else if (strncmp(str, "fix", 3) == 0) > > - ima_appraise = IMA_APPRAISE_FIX; > > + appraisal_state = IMA_APPRAISE_FIX; > > else if (strncmp(str, "enforce", 7) == 0) > > - ima_appraise = IMA_APPRAISE_ENFORCE; > > + appraisal_state = IMA_APPRAISE_ENFORCE; > > else > > pr_err("invalid \"%s\" appraise option", str); > > + > > + /* If appraisal state was changed, but secure boot is enabled, > > +* keep its default */ > > + if (sb_state) { > > + if (!(appraisal_state & IMA_APPRAISE_ENFORCE)) > > + pr_info("Secure boot enabled: ignoring ima_appraise=%s > > option", > > + str); > > + else > > + ima_appraise = appraisal_state; > > + } > > Shouldn't the "else" clause be here. No need to re-post the entire > patch set. Yes, of course it should. Sorry. Sending the v3 for this patch. > > thanks, > > Mimi > > > #endif > > return 1; > > } > > -- bmeneg PGP Key: http://bmeneg.com/pubkey.txt
Re: [PATCH] RISC-V: Allow drivers to provide custom read_cycles64 for M-mode kernel
On Fri, 04 Sep 2020 09:57:09 PDT (-0700), Christoph Hellwig wrote:
> On Fri, Sep 04, 2020 at 10:13:18PM +0530, Anup Patel wrote:
>> I respectfully disagree. IMHO, the previous code made the RISC-V
>> timer driver convoluted (both SBI call and CLINT in one place) and
>> mandated CLINT for NoMMU kernel. In fact, RISC-V spec does not
>> mandate CLINT or PLIC. The RISC-V SOC vendors are free to
>> implement their own timer device, IPI device and interrupt controller.
>
> Yes, exactly what we need is everyone coming up with another stupid
> non-standard timer and irq driver.

Well, we don't have a standard one so there's really no way around people
coming up with their own.  It doesn't seem reasonable to just say "SiFive's
driver landed first, so we will accept no other timer drivers for RISC-V
systems".

> But the point is this crap came in after -rc1, and it adds totally
> pointless indirect calls to the IPI path, and with your "fix" also
> to get_cycles which all have exactly one implementation for MMU or
> NOMMU kernels.
>
> So the only sensible thing is to revert all this crap.  And if at some
> point we actually have to deal with different implementations do it
> with alternatives or static_branch infrastructure so that we don't
> pay the price for indirect calls in the super hot path.

I'm OK reverting the dynamic stuff, as I can buy it needs more time to bake,
but I'm not sure performance is the right argument -- while this is adding an
indirection, decoupling MMU/NOMMU from the timer driver is the first step
towards getting rid of the traps which are a way bigger performance issue than
the indirection (not to mention the issue of relying on instructions that don't
technically exist in the ISA we're relying on any more).

I'm not really convinced the timers are on such a hot path that an extra load
is that bad, but I don't have that much experience with this stuff so you may
be right.  I'd prefer to keep the driver separate, though, and just bring back
the direct CLINT implementation in timex.h -- we've only got one implementation
for now anyway, so it doesn't seem that bad to just inline it (and I suppose I
could buy that the ISA says this register has to behave this way, though I
don't think that's all that strong of an argument).

I'm not convinced this is a big performance hit for IPIs either, but we could
just do the same thing over there -- though I guess I'd be much less convinced
about any arguments as to the ISA having a say in that as IIRC it's a lot more
hands off.

Something like this seems to fix the rdtime issue without any extra overhead,
but I haven't tested it

diff --git a/arch/riscv/include/asm/clint.h b/arch/riscv/include/asm/clint.h
new file mode 100644
index ..51909ab60ad0
--- /dev/null
+++ b/arch/riscv/include/asm/clint.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Google, Inc
+ */
+
+#ifndef _ASM_RISCV_CLINT_H
+#define _ASM_RISCV_CLINT_H
+
+#include
+#include
+
+#ifdef CONFIG_RISCV_M_MODE
+/*
+ * This lives in the CLINT driver, but is accessed directly by timex.h to avoid
+ * any overhead when accessing the MMIO timer.
+ */
+extern u64 __iomem *clint_time_val;
+#endif
+
+#endif
diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
index a3fb85d505d4..7f659dda0032 100644
--- a/arch/riscv/include/asm/timex.h
+++ b/arch/riscv/include/asm/timex.h
@@ -10,6 +10,31 @@

 typedef unsigned long cycles_t;

+#ifdef CONFIG_RISCV_M_MODE
+
+#include
+
+#ifdef CONFIG_64BIT
+static inline cycles_t get_cycles(void)
+{
+	return readq_relaxed(clint_time_val);
+}
+#else /* !CONFIG_64BIT */
+static inline u32 get_cycles(void)
+{
+	return readl_relaxed(((u32 *)clint_time_val));
+}
+#define get_cycles get_cycles
+
+static inline u32 get_cycles_hi(void)
+{
+	return readl_relaxed(((u32 *)clint_time_val) + 1);
+}
+#define get_cycles_hi get_cycles_hi
+#endif /* CONFIG_64BIT */
+
+#else /* CONFIG_RISCV_M_MODE */
+
 static inline cycles_t get_cycles(void)
 {
 	return csr_read(CSR_TIME);
@@ -41,6 +66,8 @@ static inline u64 get_cycles64(void)
 }
 #endif /* CONFIG_64BIT */

+#endif /* !CONFIG_RISCV_M_MODE */
+
 #define ARCH_HAS_READ_CURRENT_TIMER
 static inline int read_current_timer(unsigned long *timer_val)
 {
diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
index 8eeafa82c03d..43ae0f885bfa 100644
--- a/drivers/clocksource/timer-clint.c
+++ b/drivers/clocksource/timer-clint.c
@@ -19,6 +19,11 @@
 #include
 #include
 #include
+#include
+
+#ifndef CONFIG_MMU
+#include
+#endif

 #define CLINT_IPI_OFF		0
 #define CLINT_TIMER_CMP_OFF	0x4000
@@ -31,6 +36,10 @@ static u64 __iomem *clint_timer_val;
 static unsigned long clint_timer_freq;
 static unsigned int clint_timer_irq;

+#ifdef CONFIG_RISCV_M_MODE
+u64 __iomem *clint_time_val;
+#endif
+
 static void clint_send_ipi(const struct cpumask *target)
 {
 	unsigned int cpu;
@@ -184,6 +193,14 @@ static int __init clint_timer_init_dt(struct device_node
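The 32-bit branch of the diff above exposes the MMIO counter as two halves, get_cycles() and get_cycles_hi(). The diff shows get_cycles64() only as a hunk header, so how the halves are combined is not visible here. A common technique is a high/low/high retry loop, sketched below in user space. The simulated counter (sim_counter, read_lo(), read_hi()) is a hypothetical stand-in for the readl_relaxed() accesses, and it advances on every read so that the carry case can be exercised.

```c
#include <stdint.h>

/* Hypothetical simulated 64-bit timer; it ticks once per 32-bit read. */
static uint64_t sim_counter;

static uint32_t read_lo(void)
{
	return (uint32_t)(sim_counter++);
}

static uint32_t read_hi(void)
{
	return (uint32_t)((sim_counter++) >> 32);
}

/* Naive combination: can tear if the low half wraps between the two reads. */
static uint64_t read_cycles64_naive(void)
{
	uint32_t hi = read_hi();
	uint32_t lo = read_lo();

	return ((uint64_t)hi << 32) | lo;
}

/* Retry until the high half is the same before and after the low read. */
static uint64_t read_cycles64(void)
{
	uint32_t hi, lo;

	do {
		hi = read_hi();
		lo = read_lo();
	} while (hi != read_hi());

	return ((uint64_t)hi << 32) | lo;
}
```

Starting from 0xFFFFFFFF, the naive version pairs the old high half with the wrapped low half and returns a value behind the counter, while the retry version notices the changed high half and reads again.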
Re: [linux-next PATCH v4] drivers/virt/fsl_hypervisor: Fix error handling path
Hi Andrew, On Wed, Sep 2, 2020 at 3:00 AM John Hubbard wrote: > > On 9/1/20 2:21 PM, Souptick Joarder wrote: > > First, when memory allocation for sg_list_unaligned failed, there > > is a bug of calling put_pages() as we haven't pinned any pages. > > > > Second, if get_user_pages_fast() failed we should unpin num_pinned > > pages. > > > > This will address both. > > > > As part of these changes, minor update in documentation. > > > > Fixes: 6db7199407ca ("drivers/virt: introduce Freescale hypervisor > > management driver") > > Signed-off-by: Souptick Joarder > > Reviewed-by: Dan Carpenter > > Reviewed-by: John Hubbard > > --- > > This looks good to me. Can you please take this patch through the mm tree ? > > thanks, > -- > John Hubbard > NVIDIA > > > v2: > > Added review tag. > > > > v3: > > Address review comment on v2 from John. > > Added review tag. > > > > v4: > >Address another set of review comments from John. > > > > drivers/virt/fsl_hypervisor.c | 17 - > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/virt/fsl_hypervisor.c b/drivers/virt/fsl_hypervisor.c > > index 1b0b11b..46ee0a0 100644 > > --- a/drivers/virt/fsl_hypervisor.c > > +++ b/drivers/virt/fsl_hypervisor.c > > @@ -157,7 +157,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > > > unsigned int i; > > long ret = 0; > > - int num_pinned; /* return value from get_user_pages() */ > > + int num_pinned = 0; /* return value from get_user_pages_fast() */ > > phys_addr_t remote_paddr; /* The next address in the remote buffer */ > > uint32_t count; /* The number of bytes left to copy */ > > > > @@ -174,7 +174,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > return -EINVAL; > > > > /* > > - * The array of pages returned by get_user_pages() covers only > > + * The array of pages returned by get_user_pages_fast() covers only > >* page-aligned memory. 
Since the user buffer is probably not > >* page-aligned, we need to handle the discrepancy. > >* > > @@ -224,7 +224,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > > > /* > >* 'pages' is an array of struct page pointers that's initialized by > > - * get_user_pages(). > > + * get_user_pages_fast(). > >*/ > > pages = kcalloc(num_pages, sizeof(struct page *), GFP_KERNEL); > > if (!pages) { > > @@ -241,7 +241,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > if (!sg_list_unaligned) { > > pr_debug("fsl-hv: could not allocate S/G list\n"); > > ret = -ENOMEM; > > - goto exit; > > + goto free_pages; > > } > > sg_list = PTR_ALIGN(sg_list_unaligned, sizeof(struct fh_sg_list)); > > > > @@ -250,7 +250,6 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > num_pages, param.source != -1 ? FOLL_WRITE : 0, pages); > > > > if (num_pinned != num_pages) { > > - /* get_user_pages() failed */ > > pr_debug("fsl-hv: could not lock source buffer\n"); > > ret = (num_pinned < 0) ? num_pinned : -EFAULT; > > goto exit; > > @@ -292,13 +291,13 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy > > __user *p) > > virt_to_phys(sg_list), num_pages); > > > > exit: > > - if (pages) { > > - for (i = 0; i < num_pages; i++) > > - if (pages[i]) > > - put_page(pages[i]); > > + if (pages && (num_pinned > 0)) { > > + for (i = 0; i < num_pinned; i++) > > + put_page(pages[i]); > > } > > > > kfree(sg_list_unaligned); > > +free_pages: > > kfree(pages); > > > > if (!ret) > > >
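The two error paths the patch above fixes can be sketched in user space as follows. This is only an illustration of the goto-based unwinding, not the driver code. fake_pin_pages(), fake_put_page(), copy_sketch() and the live_pins counter are hypothetical stand-ins, and the int array stands in for the struct page pointers.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: number of pages currently pinned. */
static int live_pins;

/* Stand-in for get_user_pages_fast(): may pin fewer pages than requested. */
static int fake_pin_pages(int want, int can_pin, int *pages)
{
	int i;

	if (can_pin < 0)
		return can_pin;		/* hard failure, nothing pinned */
	for (i = 0; i < want && i < can_pin; i++) {
		pages[i] = 1;
		live_pins++;
	}
	return i;			/* possibly a partial count */
}

static void fake_put_page(int *page)
{
	*page = 0;
	live_pins--;
}

/* Mirrors the patched unwinding: unpin exactly num_pinned pages. */
static long copy_sketch(int num_pages, int can_pin, int sg_alloc_fails)
{
	long ret = 0;
	int num_pinned = 0;
	int *pages;
	void *sg_list;
	int i;

	pages = calloc(num_pages, sizeof(*pages));
	if (!pages)
		return -ENOMEM;

	sg_list = sg_alloc_fails ? NULL : malloc(64);
	if (!sg_list) {
		ret = -ENOMEM;
		goto free_pages;	/* nothing has been pinned yet */
	}

	num_pinned = fake_pin_pages(num_pages, can_pin, pages);
	if (num_pinned != num_pages) {
		ret = (num_pinned < 0) ? num_pinned : -EFAULT;
		goto exit;
	}

	/* The real driver would fill the S/G list and call the hypervisor here. */

exit:
	/* A negative num_pinned skips the loop entirely. */
	for (i = 0; i < num_pinned; i++)
		fake_put_page(&pages[i]);
	free(sg_list);
free_pages:
	free(pages);
	return ret;
}
```

A partial pin (for example 2 of 4 pages) returns -EFAULT and releases only the two pinned pages, and a failed S/G allocation never reaches the unpin loop.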
Re: [PATCH] mm/gup: don't permit users to call get_user_pages with FOLL_LONGTERM
On Fri, Sep 4, 2020 at 12:09 AM Matthew Wilcox wrote:
>
> On Thu, Sep 03, 2020 at 12:42:44PM +0530, Souptick Joarder wrote:
> > We can use is_valid_gup_flags() inside ->
> > get_user_pages_locked(),
> > get_user_pages_unlocked(),
> > pin_user_pages_locked() as well.
> >
> > Are you planning to add it in future patches ?
>
> If you're looking for a new project, adding a foll_t or gup_t or
> something for the FOLL flags (like we have for gfp_t or vm_fault_t)
> would be helpful.  We're inconsistent with our naming here.

Sure. I will start looking into this and come up with an RFC version.
[GIT PULL] ARC updates for 5.9-rc4
Hi Linus,

Please pull.

Thx,
-Vineet
--->
The following changes since commit 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5:

  Linux 5.9-rc1 (2020-08-16 13:04:57 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc.git/ tags/arc-5.9-rc4

for you to fetch changes up to 26907eb605fbc3ba9dbf888f21d9d8d04471271d:

  ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id (2020-09-01 11:59:04 -0700)

ARC fixes for 5.9-rc4

 - HSDK-4xd Dev system: perf driver updates for sampling interrupt
 - HSDK* Dev System : Ethernet broken [Evgeniy Didin]
 - HIGHMEM broken (2 memory banks) [Mike Rapoport]
 - show_regs() rewrite once and for all
 - Other minor fixes

Evgeniy Didin (1):
      ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id

Mike Rapoport (1):
      arc: fix memory initialization for systems with two memory banks

Randy Dunlap (1):
      ARC: pgalloc.h: delete a duplicated word + other fixes

Vineet Gupta (4):
      ARC: perf: don't bail setup if pct irq missing in device-tree
      ARC: HSDK: wireup perf irq
      ARC: show_regs: fix r12 printing and simplify
      irqchip/eznps: Fix build error for !ARC700 builds

 arch/arc/boot/dts/hsdk.dts              |  6 ++-
 arch/arc/include/asm/pgalloc.h          |  4 +-
 arch/arc/kernel/perf_event.c            | 14 ++
 arch/arc/kernel/troubleshoot.c          | 77 +
 arch/arc/mm/init.c                      | 27 +++-
 arch/arc/plat-eznps/include/plat/ctop.h |  1 -
 include/soc/nps/common.h                |  6 +++
 7 files changed, 62 insertions(+), 73 deletions(-)
Re: [PATCH 2/3] integrity: Move import of MokListRT certs to a separate routine
On 9/2/20 3:55 AM, Andy Shevchenko wrote:
> On Wed, Aug 26, 2020 at 6:45 AM Lenny Szubowicz wrote:
>>
>> Move the loading of certs from the UEFI MokListRT into a separate
>> routine to facilitate additional MokList functionality.
>>
>> There is no visible functional change as a result of this patch.
>> Although the UEFI dbx certs are now loaded before the MokList certs,
>> they are loaded onto different key rings. So the order of the keys
>> on their respective key rings is the same.
>
> ...
>
>>  /*
>> + * load_moklist_certs() - Load MokList certs
>> + *
>> + * Returns:	Summary error status
>> + *
>> + * Load the certs contained in the UEFI MokListRT database into the
>> + * platform trusted keyring.
>> + */
>
> Hmm... Is it intentionally kept out of kernel doc format?

Yes. Since this is a static local routine, I thought that it shouldn't
be included by kerneldoc. But I wanted to generally adhere to the kernel doc
conventions for a routine header. To that end, in V2 I move the "Return:"
section to come after the short description.

>> +static int __init load_moklist_certs(void)
>> +{
>> +	efi_guid_t mok_var = EFI_SHIM_LOCK_GUID;
>> +	void *mok = NULL;
>> +	unsigned long moksize = 0;
>> +	efi_status_t status;
>> +	int rc = 0;
>
> Redundant assignment (see below).
>
>> +	/* Get MokListRT. It might not exist, so it isn't an error
>> +	 * if we can't get it.
>> +	 */
>> +	mok = get_cert_list(L"MokListRT", &mok_var, &moksize, &status);
>> +	if (!mok) {
>
> Why not positive conditional? Sometimes ! is hard to notice.
>
>> +		if (status == EFI_NOT_FOUND)
>> +			pr_debug("MokListRT variable wasn't found\n");
>> +		else
>> +			pr_info("Couldn't get UEFI MokListRT\n");
>> +	} else {
>> +		rc = parse_efi_signature_list("UEFI:MokListRT",
>> +					      mok, moksize, get_handler_for_db);
>> +		if (rc)
>> +			pr_err("Couldn't parse MokListRT signatures: %d\n", rc);
>> +		kfree(mok);
>
>   kfree(...)
>   if (rc)
>     ...
>   return rc;
>
> And with positive conditional there will be no need to have redundant
> 'else' followed by additional level of indentation.
>
>> +	}
>> +	return rc;
>
>   return 0;
>
>> +}
>
> P.S. Yes, I see that the above was in the original code, so, consider
> my comments as suggestions to improve the code.

I agree that your suggestions improve the code. I've incorporated this into V2.

                      -Lenny.
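One way to read Andy's suggestions is sketched below as a user-space program. This is not the V2 patch, which is not shown in this thread. fake_get_cert_list(), fake_parse() and the fake_status enum are hypothetical stand-ins for the EFI helpers, and the parse result is passed in as a parameter so that both outcomes can be exercised.

```c
#include <stdio.h>
#include <stdlib.h>

enum fake_status { FAKE_SUCCESS, FAKE_NOT_FOUND, FAKE_ERROR };

/* Hypothetical stand-in for get_cert_list(): NULL unless 'want' is success. */
static void *fake_get_cert_list(enum fake_status want, unsigned long *size,
				enum fake_status *status)
{
	*status = want;
	if (want != FAKE_SUCCESS) {
		*size = 0;
		return NULL;
	}
	*size = 16;
	return calloc(1, 16);
}

/* Hypothetical stand-in for parse_efi_signature_list(). */
static int fake_parse(const void *list, unsigned long size, int parse_rc)
{
	(void)list;
	(void)size;
	return parse_rc;
}

static int load_moklist_sketch(enum fake_status want, int parse_rc)
{
	unsigned long moksize = 0;
	enum fake_status status;
	void *mok;
	int rc;		/* no initializer needed: set before every use */

	mok = fake_get_cert_list(want, &moksize, &status);
	if (mok) {	/* positive conditional first */
		rc = fake_parse(mok, moksize, parse_rc);
		free(mok);
		if (rc)
			fprintf(stderr, "Couldn't parse MokListRT signatures: %d\n", rc);
		return rc;
	}

	/* A missing list is not an error, so this path returns 0. */
	if (status == FAKE_NOT_FOUND)
		fprintf(stderr, "MokListRT variable wasn't found\n");
	else
		fprintf(stderr, "Couldn't get UEFI MokListRT\n");
	return 0;
}
```

Returning early from the success branch removes both the "else" and the extra indentation level, and leaves "return 0" as the only exit on the not-found path.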
Re: [PATCH v1 01/12] fpga: fpga security manager class driver
On 9/4/20 5:23 PM, Moritz Fischer wrote: Hi Russ, On Fri, Sep 04, 2020 at 04:52:54PM -0700, Russ Weight wrote: Create the Intel Security Manager class driver. The security manager provides interfaces to manage secure updates for the FPGA and BMC images that are stored in FLASH. The driver can also be used to update root entry hashes and to cancel code signing keys. This patch creates the class driver and provides sysfs interfaces for displaying root entry hashes, canceled code signing keys and flash counts. Signed-off-by: Russ Weight Signed-off-by: Xu Yilun As for Reviewed-by tags I had seen on other patches in the series, I'd prefer for that to happen on public mailing lists. If Hao reviewed patches on some internal Intel list I won't know about it, so please have him properly Ack/Reviewed-by tag things on a public mailing list. Sure - I'll remove the Ack/Reviewed-by tags that were added internally before I submit the next version of the patchset (except where Hao re-adds them on the public list during this review cycle). --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 75 MAINTAINERS | 8 + drivers/fpga/Kconfig | 9 + drivers/fpga/Makefile | 3 + drivers/fpga/ifpga-sec-mgr.c | 339 ++ include/linux/fpga/ifpga-sec-mgr.h| 145 6 files changed, 579 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr create mode 100644 drivers/fpga/ifpga-sec-mgr.c create mode 100644 include/linux/fpga/ifpga-sec-mgr.h diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr new file mode 100644 index ..86f8992559bf --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -0,0 +1,75 @@ +What: /sys/class/ifpga_sec_mgr/ifpga_secX/name +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Name of low level fpga security manager driver. 
+ +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the static + region if one is programmed, else it returns the + string: "hash not programmed". This file is only + visible if the underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the partial + reconfiguration region if one is programmed, else it + returns the string: "hash not programmed". This file + is only visible if the underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the BMC image + if one is programmed, else it returns the string: + "hash not programmed". This file is only visible if the + underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the static region. The standard bitmap + list format is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the partial reconfiguration region. The + standard bitmap list format is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the BMC. 
The standard bitmap list format + is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/user_flash_count +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns number of times the user image for the + static region has been flashed. + Format: "%d". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_flash_count +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ
Re: [PATCH v7 3/3] fpga: dfl: add support for N3000 Nios private feature
Hi Xu, On Wed, Aug 19, 2020 at 03:45:21PM +0800, Xu Yilun wrote: > This patch adds support for the Nios handshake private feature on Intel > PAC (Programmable Acceleration Card) N3000. > > The Nios is the embedded processor on the FPGA card. This private feature > provides a handshake interface to FPGA Nois firmware, which receives FPGA *NIOS* or *Nios* I think ;-) > retimer configuration command from host and executes via an internal SPI > master (spi-altera). When Nios finishes the configuration, host takes over > the ownership of the SPI master to control an Intel MAX10 BMC (Board > Management Controller) Chip on the SPI bus. > > For Nios firmware handshake part, this driver requests the retimer > configuration for Nios with parameters from module param, and adds some > sysfs nodes for user to query the onboard retimer's working mode and > Nios firmware version. > > For SPI part, this driver adds a spi-altera platform device as well as > the MAX10 BMC spi slave info. A spi-altera driver will be matched to > handle the following SPI work. > > Signed-off-by: Xu Yilun > Signed-off-by: Wu Hao > Signed-off-by: Matthew Gerlach > Signed-off-by: Russ Weight > Reviewed-by: Tom Rix > Acked-by: Wu Hao > --- > v3: Add the doc for this driver > Minor fixes for comments from Tom > v4: Move the err log in regmap implementation, and delete > n3000_nios_writel/readl(), they have nothing to wrapper now. > Some minor fixes and comments improvement. > v5: Fix the output of fec_mode sysfs inf to "no" on 10G configuration, > cause no FEC mode could be configured for 10G. > Rename the dfl_n3000_nios_* to n3000_nios_* > Improves comments. > v6: Fix the output of fec_mode sysfs inf to "not supported" if in 10G, > or the firmware version major < 3. > Minor fixes and improves comments. > v7: Improves comments. 
> --- > .../ABI/testing/sysfs-bus-dfl-devices-n3000-nios | 21 + > Documentation/fpga/dfl-n3000-nios.rst | 80 +++ > Documentation/fpga/index.rst | 1 + > drivers/fpga/Kconfig | 11 + > drivers/fpga/Makefile | 2 + > drivers/fpga/dfl-n3000-nios.c | 542 > + > 6 files changed, 657 insertions(+) > create mode 100644 Documentation/ABI/testing/sysfs-bus-dfl-devices-n3000-nios > create mode 100644 Documentation/fpga/dfl-n3000-nios.rst > create mode 100644 drivers/fpga/dfl-n3000-nios.c > > diff --git a/Documentation/ABI/testing/sysfs-bus-dfl-devices-n3000-nios > b/Documentation/ABI/testing/sysfs-bus-dfl-devices-n3000-nios > new file mode 100644 > index 000..ce5b474 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-bus-dfl-devices-n3000-nios > @@ -0,0 +1,21 @@ > +What:/sys/bus/dfl/devices/dfl_dev.X/fec_mode > +Date:Aug 2020 > +KernelVersion: 5.10 > +Contact: Xu Yilun > +Description: Read-only. It returns the FEC mode of the 25G links of the > + ethernet retimers configured by NIOS firmware. "rs" for Reed > + Solomon FEC, "kr" for Fire Code FEC, "no" for NO FEC. > + "not supported" if the FEC mode setting is not supported, this > + happens when the Nios firmware version major < 3, or no link is > + configured to 25G. The FEC mode could be set by module > + parameters, but it could only be set once after the board > + powers up. > + Format: string > + > +What:/sys/bus/dfl/devices/dfl_dev.X/nios_fw_version > +Date:Aug 2020 > +KernelVersion: 5.10 > +Contact: Xu Yilun > +Description: Read-only. It returns the version of the NIOS firmware in FPGA. > + Its format is "major.minor.patch". > + Format: %x.%x.%x > diff --git a/Documentation/fpga/dfl-n3000-nios.rst > b/Documentation/fpga/dfl-n3000-nios.rst > new file mode 100644 > index 000..72dd600 > --- /dev/null > +++ b/Documentation/fpga/dfl-n3000-nios.rst > @@ -0,0 +1,80 @@ > +.. 
SPDX-License-Identifier: GPL-2.0 > + > += > +N3000 Nios Private Feature Driver > += > + > +The N3000 Nios driver supports for the Nios handshake private feature on > Intel > +PAC (Programmable Acceleration Card) N3000. > + > +The Nios is the embedded processor in the FPGA, it will configure the 2 > onboard > +ethernet retimers on power up. This private feature provides a handshake > +interface to FPGA Nios firmware, which receives the ethernet retimer > +configuration command from host and does the configuration via an internal > SPI > +master (spi-altera). When Nios finishes the configuration, host takes over > the > +ownership of the SPI master to control an Intel MAX10 BMC (Board Management > +Controller) Chip on the SPI bus. > + > +So the driver does 2 major tasks on probe, uses the Nios firmware to > configure > +the ethernet
Re: [PATCH 2/2] Add a new sysctl knob: unprivileged_userfaultfd_user_mode_only
On Thu, Sep 3, 2020 at 8:34 PM Andrea Arcangeli wrote: > > Hello, > > On Mon, Aug 17, 2020 at 03:11:16PM -0700, Lokesh Gidra wrote: > > There has been an emphasis that Android is probably the only user for > > the restriction of userfaults from kernel-space and that it wouldn’t > > be useful anywhere else. I humbly disagree! There are various areas > > where the PROT_NONE+SIGSEGV trick is (and can be) used in a purely > > user-space setting. Basically, any lazy, on-demand, > > For the record what I said is quoted below > https://lkml.kernel.org/r/20200520194804.gj26...@redhat.com : > > """It all boils down of how peculiar it is to be able to leverage only > the acceleration [..] Right now there's a single user that can cope > with that limitation [..] If there will be more users [..] it'd be > fine to add a value "2" later.""" > > Specifically I never said "that it wouldn’t be useful anywhere else.". > Thanks a lot for clarifying. > Also I'm only arguing about the sysctl visible kABI change in patch > 2/2: the flag passed as parameter to the syscall in patch 1/2 is all > great, because seccomp needs it in the scalar parameter of the syscall > to implement a filter equivalent to your sysctl "2" policy with only > patch 1/2 applied. > > I've two more questions now: > > 1) why don't you enforce the block of kernel initiated faults with >seccomp-bpf instead of adding a sysctl value 2? Is the sysctl just >an optimization to remove a few instructions per syscall in the bpf >execution of Android unprivileged apps? You should block a lot of >other syscalls by default to all unprivileged processes, including >vmsplice. > >In other words if it's just for Android, why can't Android solve it >with only patch 1/2 by tweaking the seccomp filter? I would let Nick (nnk@) and Jeff (jeffv@) respond to this. 
The previous responses from both of them on this email thread (https://lore.kernel.org/lkml/CABXk95A-E4NYqA5qVrPgDF18YW-z4_udzLwa0cdo2OfqVsy=s...@mail.gmail.com/ and https://lore.kernel.org/lkml/CAFJ0LnGfrzvVgtyZQ+UqRM6F3M7iXOhTkUBTc+9sV+=rrfn...@mail.gmail.com/) suggest that the performance overhead of seccomp-bpf is too much. Kees also objected to it (https://lore.kernel.org/lkml/202005200921.2BD5A0ADD@keescook/) I'm not familiar with how seccomp-bpf works. All that I can add here is that userfaultfd syscall is usually not invoked in a performance critical code path. So, if the performance overhead of seccomp-bpf (if enabled) is observed on all syscalls originating from a process, then I'd say patch 2/2 is essential. Otherwise, it should be ok to let seccomp perform the same functionality instead. > > 2) given that Android is secure enough with the sysctl at value 2, why >should we even retain the current sysctl 0 semantics? Why can't >more secure systems just use seccomp and block userfaultfd, as it >is already happens by default in the podman default seccomp >whitelist (for those containers that don't define a new json >whitelist in the OCI schema)? Shouldn't we focus our energy in >making containers more secure by preventing the OCI schema of a >random container to re-enable userfaultfd in the container seccomp >filter instead of trying to solve this with a global sysctl? > >What's missing in my view is a kubernetes hard allowlist/denylist >that cannot be overridden with the OCI schema in case people has >the bad idea of running containers downloaded from a not fully >trusted source, without adding virt isolation and that's an >userland problem to be solved in the container runtime, not a >kernel issue. Then you'd just add userfaultfd to the json of the >k8s hard seccomp denylist instead of going around tweaking sysctl. > > What's your take in changing your 2/2 patch to just replace value "0" > and avoid introducing a new value "2"? SGTM. 
Disabling uffd completely for unprivileged processes can be achieved either using seccomp-bpf, or via SELinux, once the following patch series is upstreamed https://lore.kernel.org/lkml/20200827063522.2563293-1-lokeshgi...@google.com/ > > The value "0" was motivated by the concern that uffd can enlarge the > race window for use after free by providing one more additional way to > block kernel faults, but value "2" is already enough to solve that > concern completely and it'll be the default on all Android. > > In other words by adding "2" you're effectively doing a more > finegrined and more optimal implementation of "0" that remains useful > and available to unprivileged apps and it already resolves all > "robustness against side effects other kernel bugs" concerns. Clearly > "0" is even more secure statistically but that would apply to every > other syscall including vmsplice, and there's no > /proc/sys/vm/unprivileged_vmsplice sysctl out there. > > The next issue we have now is with the pipe mutex (which is not a > major concern but we need to solve it somehow for correctness). So
Re: [PATCH v7 2/3] fpga: dfl: create a dfl bus type to support DFL devices
Hi Xu, On Wed, Aug 19, 2020 at 03:45:20PM +0800, Xu Yilun wrote: > A new bus type "dfl" is introduced for private features which are not > initialized by DFL feature drivers (dfl-fme & dfl-afu drivers). So these > private features could be handled by separate driver modules. > > DFL feature drivers (dfl-fme, dfl-port) will create DFL devices on > enumeration. DFL drivers could be registered on this bus to match these > DFL devices. They are matched by dfl type & feature_id. > > Signed-off-by: Xu Yilun > Signed-off-by: Wu Hao > Signed-off-by: Matthew Gerlach > Signed-off-by: Russ Weight > Reviewed-by: Tom Rix > Acked-by: Wu Hao > --- > v2: change the bus uevent format. > change the dfl device's sysfs name format. > refactor dfl_dev_add(). > minor fixes for comments from Hao and Tom. > v3: no change. > v4: improve the uevent format, 4 bits for type & 12 bits for id. > change dfl_device->type to u8. > A dedicate field in struct dfl_feature for dfl device instance. > error out if dfl_device already exist on dfl_devs_init(). > v5: minor fixes for Hao's comments > v6: the input param of dfl_devs_add() changes to struct > dfl_feature_platform_data. > improve the comments. > v7: no change. > --- > Documentation/ABI/testing/sysfs-bus-dfl | 15 ++ > drivers/fpga/dfl.c | 262 > +++- > drivers/fpga/dfl.h | 86 +++ > 3 files changed, 355 insertions(+), 8 deletions(-) > create mode 100644 Documentation/ABI/testing/sysfs-bus-dfl > > diff --git a/Documentation/ABI/testing/sysfs-bus-dfl > b/Documentation/ABI/testing/sysfs-bus-dfl > new file mode 100644 > index 000..23543be > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-bus-dfl > @@ -0,0 +1,15 @@ > +What:/sys/bus/dfl/devices/dfl_dev.X/type > +Date:Aug 2020 > +KernelVersion: 5.10 > +Contact: Xu Yilun > +Description: Read-only. It returns type of DFL FIU of the device. Now DFL > + supports 2 FIU types, 0 for FME, 1 for PORT. 
> + Format: 0x%x > + > +What:/sys/bus/dfl/devices/dfl_dev.X/feature_id > +Date:Aug 2020 > +KernelVersion: 5.10 > +Contact: Xu Yilun > +Description: Read-only. It returns feature identifier local to its DFL FIU > + type. > + Format: 0x%x > diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c > index 52cafa2..c5ba4ac9 100644 > --- a/drivers/fpga/dfl.c > +++ b/drivers/fpga/dfl.c > @@ -30,12 +30,6 @@ static DEFINE_MUTEX(dfl_id_mutex); > * index to dfl_chardevs table. If no chardev support just set devt_type > * as one invalid index (DFL_FPGA_DEVT_MAX). > */ > -enum dfl_id_type { > - FME_ID, /* fme id allocation and mapping */ > - PORT_ID,/* port id allocation and mapping */ > - DFL_ID_MAX, > -}; > - > enum dfl_fpga_devt_type { > DFL_FPGA_DEVT_FME, > DFL_FPGA_DEVT_PORT, > @@ -250,6 +244,244 @@ int dfl_fpga_check_port_id(struct platform_device > *pdev, void *pport_id) > } > EXPORT_SYMBOL_GPL(dfl_fpga_check_port_id); > > +static DEFINE_IDA(dfl_device_ida); > + > +static const struct dfl_device_id * > +dfl_match_one_device(const struct dfl_device_id *id, struct dfl_device *ddev) > +{ > + if (id->type == ddev->type && id->feature_id == ddev->feature_id) > + return id; > + > + return NULL; > +} > + > +static int dfl_bus_match(struct device *dev, struct device_driver *drv) > +{ > + struct dfl_device *ddev = to_dfl_dev(dev); > + struct dfl_driver *ddrv = to_dfl_drv(drv); > + const struct dfl_device_id *id_entry = ddrv->id_table; > + > + if (id_entry) { > + while (id_entry->feature_id) { > + if (dfl_match_one_device(id_entry, ddev)) { > + ddev->id_entry = id_entry; > + return 1; > + } > + id_entry++; > + } > + } > + > + return 0; > +} > + > +static int dfl_bus_probe(struct device *dev) > +{ > + struct dfl_device *ddev = to_dfl_dev(dev); > + struct dfl_driver *ddrv = to_dfl_drv(dev->driver); Can you swap those for reverse x-mas tree where possible? struct dfl_driver *ddrv = to_dfl_drv(dev->driver); struct dfl_device *ddev = to_dfl_dev(dev); ... 
> + > + return ddrv->probe(ddev); > +} > + > +static int dfl_bus_remove(struct device *dev) > +{ > + struct dfl_device *ddev = to_dfl_dev(dev); > + struct dfl_driver *ddrv = to_dfl_drv(dev->driver); Same here. > + > + if (ddrv->remove) > + ddrv->remove(ddev); > + > + return 0; > +} > + > +static int dfl_bus_uevent(struct device *dev, struct kobj_uevent_env *env) > +{ > + struct dfl_device *ddev = to_dfl_dev(dev); > + > + /* The type has 4 valid bits and feature_id has 12 valid bits */ > + return add_uevent_var(env, "MODALIAS=dfl:t%01Xf%03X", > +
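The matching rule and the uevent string quoted above can be tried out in user space with the sketch below. It assumes, following the comment in the patch, that the type is masked to 4 bits and the feature id to 12 bits. struct dfl_id, the helper names and the feature id values used in any example table are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified counterpart of struct dfl_device_id. */
struct dfl_id {
	uint8_t type;		/* 0 for FME, 1 for PORT, per the ABI text */
	uint16_t feature_id;
};

/* Build the alias in the "dfl:t%01Xf%03X" format from the quoted patch. */
static int dfl_format_modalias(char *buf, size_t size, uint8_t type,
			       uint16_t feature_id)
{
	return snprintf(buf, size, "dfl:t%01Xf%03X",
			(unsigned int)(type & 0xF),
			(unsigned int)(feature_id & 0xFFF));
}

/* Convenience check used to compare against an expected alias string. */
static int dfl_modalias_equals(const char *expected, uint8_t type,
			       uint16_t feature_id)
{
	char buf[16];

	dfl_format_modalias(buf, sizeof(buf), type, feature_id);
	return strcmp(buf, expected) == 0;
}

/* Walk an id table terminated by feature_id == 0, as dfl_bus_match() does. */
static const struct dfl_id *dfl_find_id(const struct dfl_id *table,
					uint8_t type, uint16_t feature_id)
{
	for (; table && table->feature_id; table++)
		if (table->type == type && table->feature_id == feature_id)
			return table;
	return NULL;
}
```

Because the table walk stops at the first entry whose feature_id is zero, a feature id of 0 cannot be matched by this scheme and the table needs a zeroed sentinel entry at the end.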
RE: [RFC v2 07/11] hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
From: Boqun Feng Sent: Tuesday, September 1, 2020 8:01 PM > > When communicating with Hyper-V, HV_HYP_PAGE_SIZE should be used since > that's the page size used by Hyper-V and Hyper-V expects all > page-related data using the unit of HV_HYP_PAGE_SIZE, for example, the > "pfn" in hv_page_buffer is actually the HV_HYP_PAGE (i.e. the Hyper-V > page) number. > > In order to support guest whose page size is not 4k, we need to make > hv_netvsc always use HV_HYP_PAGE_SIZE for Hyper-V communication. > > Signed-off-by: Boqun Feng > --- > drivers/net/hyperv/netvsc.c | 2 +- > drivers/net/hyperv/netvsc_drv.c | 46 +++ > drivers/net/hyperv/rndis_filter.c | 12 > 3 files changed, 30 insertions(+), 30 deletions(-) > > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c > index 41f5cf0bb997..1d6f2256da6b 100644 > --- a/drivers/net/hyperv/netvsc.c > +++ b/drivers/net/hyperv/netvsc.c > @@ -794,7 +794,7 @@ static void netvsc_copy_to_send_buf(struct netvsc_device > *net_device, > } > > for (i = 0; i < page_count; i++) { > - char *src = phys_to_virt(pb[i].pfn << PAGE_SHIFT); > + char *src = phys_to_virt(pb[i].pfn << HV_HYP_PAGE_SHIFT); > u32 offset = pb[i].offset; > u32 len = pb[i].len; > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > index 64b0a74c1523..61ea568e1ddf 100644 > --- a/drivers/net/hyperv/netvsc_drv.c > +++ b/drivers/net/hyperv/netvsc_drv.c > @@ -373,32 +373,29 @@ static u16 netvsc_select_queue(struct net_device *ndev, > struct > sk_buff *skb, > return txq; > } > > -static u32 fill_pg_buf(struct page *page, u32 offset, u32 len, > +static u32 fill_pg_buf(unsigned long hvpfn, u32 offset, u32 len, > struct hv_page_buffer *pb) > { > int j = 0; > > - /* Deal with compound pages by ignoring unused part > - * of the page.
> - */ > - page += (offset >> PAGE_SHIFT); > - offset &= ~PAGE_MASK; > + hvpfn += offset >> HV_HYP_PAGE_SHIFT; > + offset = offset & ~HV_HYP_PAGE_MASK; > > while (len > 0) { > unsigned long bytes; > > - bytes = PAGE_SIZE - offset; > + bytes = HV_HYP_PAGE_SIZE - offset; > if (bytes > len) > bytes = len; > - pb[j].pfn = page_to_pfn(page); > + pb[j].pfn = hvpfn; > pb[j].offset = offset; > pb[j].len = bytes; > > offset += bytes; > len -= bytes; > > - if (offset == PAGE_SIZE && len) { > - page++; > + if (offset == HV_HYP_PAGE_SIZE && len) { > + hvpfn++; > offset = 0; > j++; > } > @@ -421,23 +418,26 @@ static u32 init_page_array(void *hdr, u32 len, struct > sk_buff *skb, >* 2. skb linear data >* 3. skb fragment data >*/ > - slots_used += fill_pg_buf(virt_to_page(hdr), > - offset_in_page(hdr), > - len, &pb[slots_used]); > + slots_used += fill_pg_buf(virt_to_hvpfn(hdr), > + offset_in_hvpage(hdr), > + len, > + &pb[slots_used]); > > packet->rmsg_size = len; > packet->rmsg_pgcnt = slots_used; > > - slots_used += fill_pg_buf(virt_to_page(data), > - offset_in_page(data), > - skb_headlen(skb), &pb[slots_used]); > + slots_used += fill_pg_buf(virt_to_hvpfn(data), > + offset_in_hvpage(data), > + skb_headlen(skb), > + &pb[slots_used]); > > for (i = 0; i < frags; i++) { > skb_frag_t *frag = skb_shinfo(skb)->frags + i; > > - slots_used += fill_pg_buf(skb_frag_page(frag), > - skb_frag_off(frag), > - skb_frag_size(frag), &pb[slots_used]); > + slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)), > + skb_frag_off(frag), > + skb_frag_size(frag), > + &pb[slots_used]); > } > return slots_used; > } > @@ -453,8 +453,8 @@ static int count_skb_frag_slots(struct sk_buff *skb) > unsigned long offset = skb_frag_off(frag); > > /* Skip unused frames from start of page */ > - offset &= ~PAGE_MASK; > - pages += PFN_UP(offset + size); > + offset &= ~HV_HYP_PAGE_MASK; > + pages += HVPFN_UP(offset + size); > } > return pages; > } > @@ -462,12 +462,12 @@ static int count_skb_frag_slots(struct sk_buff *skb) > static
int netvsc_get_slots(struct sk_buff *skb) > {
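The fill_pg_buf() loop quoted above splits a (pfn, offset, len) range into one entry per 4K Hyper-V page. A minimal userspace sketch of that splitting, for trying out the arithmetic; struct page_buf and fill_page_bufs() are illustrative stand-ins, not the netvsc definitions, and the sketch returns the number of entries written rather than mirroring the driver's return value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT 12
#define HV_HYP_PAGE_SIZE  (1u << HV_HYP_PAGE_SHIFT)
#define HV_HYP_PAGE_MASK  (~(HV_HYP_PAGE_SIZE - 1))

/* Illustrative stand-in for the driver's page buffer entry. */
struct page_buf {
	uint64_t pfn;
	uint32_t offset;
	uint32_t len;
};

/*
 * Split the byte range starting at 'offset' within Hyper-V page 'hvpfn'
 * into per-page entries. Returns the number of entries written to pb.
 */
static size_t fill_page_bufs(uint64_t hvpfn, uint32_t offset, uint32_t len,
			     struct page_buf *pb)
{
	size_t n = 0;

	assert(pb != NULL);

	/* Normalize: skip whole pages covered by the offset. */
	hvpfn += offset >> HV_HYP_PAGE_SHIFT;
	offset &= ~HV_HYP_PAGE_MASK;

	while (len > 0) {
		uint32_t bytes = HV_HYP_PAGE_SIZE - offset;

		if (bytes > len)
			bytes = len;

		pb[n].pfn = hvpfn;
		pb[n].offset = offset;
		pb[n].len = bytes;
		n++;

		len -= bytes;
		/* Any remaining data starts at the top of the next page. */
		hvpfn++;
		offset = 0;
	}
	return n;
}
```

For example, pfn 100 with offset 5000 and length 8000 normalizes to pfn 101, offset 904, and yields three entries of 3192, 4096 and 712 bytes.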
Re: [PATCH v1 01/12] fpga: fpga security manager class driver
Hi Russ, On Fri, Sep 04, 2020 at 04:52:54PM -0700, Russ Weight wrote: > Create the Intel Security Manager class driver. The security > manager provides interfaces to manage secure updates for the > FPGA and BMC images that are stored in FLASH. The driver can > also be used to update root entry hashes and to cancel code > signing keys. > > This patch creates the class driver and provides sysfs > interfaces for displaying root entry hashes, canceled code > signing keys and flash counts. > > Signed-off-by: Russ Weight > Signed-off-by: Xu Yilun As for Reviewed-by tags I had seen on other patches in the series, I'd prefer for that to happen on public mailing lists. If Hao reviewed patches on some internal Intel list I won't know about it, so please have him properly Ack/Reviewed-by tag things on a public mailing list. > --- > .../ABI/testing/sysfs-class-ifpga-sec-mgr | 75 > MAINTAINERS | 8 + > drivers/fpga/Kconfig | 9 + > drivers/fpga/Makefile | 3 + > drivers/fpga/ifpga-sec-mgr.c | 339 ++ > include/linux/fpga/ifpga-sec-mgr.h| 145 > 6 files changed, 579 insertions(+) > create mode 100644 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr > create mode 100644 drivers/fpga/ifpga-sec-mgr.c > create mode 100644 include/linux/fpga/ifpga-sec-mgr.h > > diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr > b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr > new file mode 100644 > index ..86f8992559bf > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr > @@ -0,0 +1,75 @@ > +What:/sys/class/ifpga_sec_mgr/ifpga_secX/name > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Name of low level fpga security manager driver. > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_root_entry_hash > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. 
Returns the root entry hash for the static > + region if one is programmed, else it returns the > + string: "hash not programmed". This file is only > + visible if the underlying device supports it. > + Format: "0x%x". > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_root_entry_hash > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns the root entry hash for the partial > + reconfiguration region if one is programmed, else it > + returns the string: "hash not programmed". This file > + is only visible if the underlying device supports it. > + Format: "0x%x". > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_root_entry_hash > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns the root entry hash for the BMC image > + if one is programmed, else it returns the string: > + "hash not programmed". This file is only visible if the > + underlying device supports it. > + Format: "0x%x". > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_canceled_csks > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns a list of indices for canceled code > + signing keys for the static region. The standard bitmap > + list format is used (e.g. "1,2-6,9"). > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_canceled_csks > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns a list of indices for canceled code > + signing keys for the partial reconfiguration region. The > + standard bitmap list format is used (e.g. "1,2-6,9"). > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_canceled_csks > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns a list of indices for canceled code > + signing keys for the BMC. The standard bitmap list format > + is used (e.g. "1,2-6,9"). 
> + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/user_flash_count > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact: Russ Weight > +Description: Read only. Returns number of times the user image for the > + static region has been flashed. > + Format: "%d". > + > +What: > /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_flash_count > +Date:Sep 2020 > +KernelVersion: 5.10 > +Contact:
RE: [RFC v2 03/11] Drivers: hv: vmbus: Introduce types of GPADL
From: Boqun Feng Sent: Tuesday, September 1, 2020 8:01 PM > > This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The > types of GPADL are purely the concept in the guest, IOW the hypervisor > treat them as the same. > > The reason of introducing the types of GPADL is to support guests whose s/of/for/ > page size is not 4k (the page size of Hyper-V hypervisor). In these > guests, both the headers and the data parts of the ringbuffers need to > be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be > mapped into userspace and 2) we use "double mapping" mechanism to > support fast wrap-around, and "double mapping" relies on ringbuffers > being page-aligned. However, the Hyper-V hypervisor only uses 4k > (HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make > the headers of ringbuffers take one guest page and when GPADL is > established between the guest and hypervisor, the only first 4k of > header is used. To handle this special case, we need the types of GPADL > to differ different guest memory usage for GPADL. > > Type enum is introduced along with several general interfaces to > describe the differences between normal buffer GPADL and ringbuffer > GPADL. > > Signed-off-by: Boqun Feng > --- > drivers/hv/channel.c | 159 +++-- > include/linux/hyperv.h | 44 +++- > 2 files changed, 182 insertions(+), 21 deletions(-) > > diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c > index 1cbe8fc931fc..7c443fd567e4 100644 > --- a/drivers/hv/channel.c > +++ b/drivers/hv/channel.c > @@ -35,6 +35,98 @@ static unsigned long virt_to_hvpfn(void *addr) > return paddr >> HV_HYP_PAGE_SHIFT; > } > > +/* > + * hv_gpadl_size - Return the real size of a gpadl, the size that Hyper-V > uses > + * > + * For BUFFER gpadl, Hyper-V uses the exact same size as the guest does. 
> + * > + * For RING gpadl, in each ring, the guest uses one PAGE_SIZE as the header > + * (because of the alignment requirement), however, the hypervisor only > + * uses the first HV_HYP_PAGE_SIZE as the header, therefore leaving a > + * (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap. And since there are two rings in a > + * ringbuffer, So the total size for a RING gpadl that Hyper-V uses is the Unneeded word "So" > + * total size that the guest uses minus twice of the gap size. > + */ > +static inline u32 hv_gpadl_size(enum hv_gpadl_type type, u32 size) > +{ > + switch (type) { > + case HV_GPADL_BUFFER: > + return size; > + case HV_GPADL_RING: > + /* The size of a ringbuffer must be page-aligned */ > + BUG_ON(size % PAGE_SIZE); > + /* > + * Two things to notice here: > + * 1) We're processing two ring buffers as a unit > + * 2) We're skipping any space larger than HV_HYP_PAGE_SIZE in > + * the first guest-size page of each of the two ring buffers. > + * So we effectively subtract out two guest-size pages, and add > + * back two Hyper-V size pages. > + */ > + return size - 2 * (PAGE_SIZE - HV_HYP_PAGE_SIZE); > + } > + BUG(); > + return 0; > +} > + > +/* > + * hv_ring_gpadl_send_offset - Calculate the send offset in a ring gpadl > based > + * on the offset in the guest > + * > + * @send_offset: the offset (in bytes) where the send ringbuffer starts in > the > + * virtual address space of the guest > + */ > +static inline u32 hv_ring_gpadl_send_offset(u32 send_offset) > +{ > + > + /* > + * For RING gpadl, in each ring, the guest uses one PAGE_SIZE as the > + * header (because of the alignment requirement), however, the > + * hypervisor only uses the first HV_HYP_PAGE_SIZE as the header, > + * therefore leaving a (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap. > + * > + * And to calculate the effective send offset in gpadl, we need to > + * substract this gap. 
> + */ > + return send_offset - (PAGE_SIZE - HV_HYP_PAGE_SIZE); > +} > + > +/* > + * hv_gpadl_hvpfn - Return the Hyper-V page PFN of the @i th Hyper-V page in > + * the gpadl > + * > + * @type: the type of the gpadl > + * @kbuffer: the pointer to the gpadl in the guest > + * @size: the total size (in bytes) of the gpadl > + * @send_offset: the offset (in bytes) where the send ringbuffer starts in > the > + * virtual address space of the guest > + * @i: the index > + */ > +static inline u64 hv_gpadl_hvpfn(enum hv_gpadl_type type, void *kbuffer, > + u32 size, u32 send_offset, int i) > +{ > + int send_idx = hv_ring_gpadl_send_offset(send_offset) >> > HV_HYP_PAGE_SHIFT; > + unsigned long delta = 0UL; > + > + switch (type) { > + case HV_GPADL_BUFFER: > + break; > + case HV_GPADL_RING: > + if (i == 0) > + delta = 0; > +
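The size and offset adjustments described in the comments above are easy to check with concrete numbers. A minimal userspace sketch, assuming the guest page size is passed in as a parameter (the kernel code uses PAGE_SIZE directly); gpadl_size() and ring_send_offset() are illustrative names mirroring hv_gpadl_size() and hv_ring_gpadl_send_offset():

```c
#include <assert.h>
#include <stdint.h>

#define HV_HYP_PAGE_SIZE 4096u	/* Hyper-V page size, as stated in the patch */

/* Mirrors HV_GPADL_{BUFFER, RING} from the patch. */
enum gpadl_type { GPADL_BUFFER, GPADL_RING };

/* Size of the gpadl as the hypervisor sees it. */
static uint32_t gpadl_size(enum gpadl_type type, uint32_t size,
			   uint32_t guest_page_size)
{
	if (type == GPADL_BUFFER)
		return size;

	/* A ringbuffer must be a whole number of guest pages. */
	assert(size % guest_page_size == 0);

	/* Two rings, each with a (guest page - 4K) gap after its header. */
	return size - 2 * (guest_page_size - HV_HYP_PAGE_SIZE);
}

/* Send offset inside the gpadl: the guest offset minus one header gap. */
static uint32_t ring_send_offset(uint32_t send_offset, uint32_t guest_page_size)
{
	return send_offset - (guest_page_size - HV_HYP_PAGE_SIZE);
}
```

With 64K guest pages and a 512K ringbuffer, the gap per ring is 61440 bytes, so the hypervisor sees 524288 - 2 * 61440 = 401408 bytes, and a guest send offset of 262144 becomes 200704. With 4K guest pages the gap is zero and nothing changes.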
[RFC PATCH] fork: Free per-cpu cached vmalloc'ed thread stacks with
The per-cpu cached vmalloc'ed stacks are currently freed in the
CPU hotplug teardown path by the free_vm_stack_cache() callback,
which invokes vfree(), which may result in purging the list of
lazily freed vmap areas.

Purging all of the lazily freed vmap areas can take a long time
when the list of vmap areas is large. This is problematic, as
free_vm_stack_cache() is invoked prior to the offline CPU's timers
being migrated. This is not desirable as it can lead to timer
migration delays in the CPU hotplug teardown path, and timer callbacks
will be invoked long after the timer has expired.

For example, on a system that has only one online CPU (CPU 1) that is
running a heavy workload, and another CPU that is being offlined,
the online CPU will invoke free_vm_stack_cache() to free the cached
vmalloc'ed stacks for the CPU being offlined. When there are 2702
vmap areas that total to 13498 pages, free_vm_stack_cache() takes
over 2 seconds to execute:

[001]   399.335808: cpuhp_enter: cpu: 0005 target: 0 step: 67 (free_vm_stack_cache)

/* The first vmap area to be freed */
[001]   399.337157: __purge_vmap_area_lazy: [0:2702] 0xffc033da8000 - 0xffc033dad000 (5 : 13498)

/* After two seconds */
[001]   401.528010: __purge_vmap_area_lazy: [1563:2702] 0xffc02fe1 - 0xffc02fe15000 (5 : 5765)

Instead of freeing the per-cpu cached vmalloc'ed stacks synchronously
with respect to the CPU hotplug teardown state machine, free them
asynchronously to help move along the CPU hotplug teardown state machine
quickly.

Signed-off-by: Isaac J. Manjarres
---
 kernel/fork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 4d32190..68346a0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -202,7 +202,7 @@ static int free_vm_stack_cache(unsigned int cpu)
 		if (!vm_stack)
 			continue;
 
-		vfree(vm_stack->addr);
+		vfree_atomic(vm_stack->addr);
 		cached_vm_stacks[i] = NULL;
 	}
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
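The idea behind the change above is to keep the expensive part of freeing out of the latency-sensitive path: the caller only queues the block, and the real work happens later. A single-threaded userspace analogy of that pattern follows; free_deferred() and drain_deferred() are hypothetical names, and this is not how the kernel implements vfree_atomic() in detail:

```c
#include <stddef.h>
#include <stdlib.h>

/* The list node is stored inside the block that is waiting to be freed. */
struct deferred_node {
	struct deferred_node *next;
};

static struct deferred_node *deferred_head;
static size_t deferred_count;

/*
 * Fast path: O(1), only links the block onto the pending list.
 * The block must be at least sizeof(struct deferred_node) bytes.
 */
static void free_deferred(void *ptr)
{
	struct deferred_node *node = ptr;

	if (!ptr)
		return;

	node->next = deferred_head;
	deferred_head = node;
	deferred_count++;
}

/*
 * Slow path: called later, outside the time-critical section.
 * Returns the number of blocks actually released.
 */
static size_t drain_deferred(void)
{
	size_t freed = 0;

	while (deferred_head) {
		struct deferred_node *node = deferred_head;

		deferred_head = node->next;
		free(node);
		freed++;
	}
	deferred_count = 0;
	return freed;
}
```

In this sketch the caller decides when drain_deferred() runs; in the kernel the deferred work is scheduled on the caller's behalf.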
Re: [PATCH net-next 9/9] net: ethernet: ti: ale: add support for multi port k3 cpsw versions
On Sat, 5 Sep 2020 02:09:24 +0300 Grygorii Strashko wrote:
> The TI J721E (CPSW9g) ALE version is similar, in general, to Sitara AM3/4/5
> CPSW ALE, but has more extended functions and different ALE VLAN entry
> format.
>
> This patch adds support for multi port TI J721E (CPSW9g) ALE variant.

and:

drivers/net/ethernet/ti/cpsw_ale.c:195:28: warning: symbol 'vlan_entry_k3_cpswxg' was not declared. Should it be static?
Re: [PATCH net-next 8/9] net: ethernet: ti: ale: switch to use tables for vlan entry description
On Sat, 5 Sep 2020 02:09:23 +0300 Grygorii Strashko wrote:
> The ALE VLAN entries differ too much between different TI CPSW ALE
> versions. So, handling them using flags, defines and get/set functions
> became over-complicated.
>
> This patch introduces tables to describe the ALE VLAN entries fields, which
> are different between TI CPSW ALE versions, and new get/set access
> functions. It also allows to detect incorrect access to not available ALL
> entry fields.

When building with W=1 C=1:

drivers/net/ethernet/ti/cpsw_ale.c:179:28: warning: symbol 'vlan_entry_cpsw' was not declared. Should it be static?
drivers/net/ethernet/ti/cpsw_ale.c:187:28: warning: symbol 'vlan_entry_nu' was not declared. Should it be static?
drivers/net/ethernet/ti/cpsw_ale.c:63: warning: Function parameter or member 'num_bits' not described in 'ale_entry_fld'
Re: [PATCH 14/20] usb/phy: mxs-usb: Use pm_ptr() macro
On 20-09-03 13:25:48, Paul Cercueil wrote:
> Use the newly introduced pm_ptr() macro, and mark the suspend/resume
> functions __maybe_unused. These functions can then be moved outside the
> CONFIG_PM_SUSPEND block, and the compiler can then process them and
> detect build failures independently of the config. If unused, they will
> simply be discarded by the compiler.
>
> Signed-off-by: Paul Cercueil
> ---
>  drivers/usb/phy/phy-mxs-usb.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/usb/phy/phy-mxs-usb.c b/drivers/usb/phy/phy-mxs-usb.c
> index 67b39dc62b37..c5e32d51563f 100644
> --- a/drivers/usb/phy/phy-mxs-usb.c
> +++ b/drivers/usb/phy/phy-mxs-usb.c
> @@ -815,8 +815,8 @@ static int mxs_phy_remove(struct platform_device *pdev)
>  	return 0;
>  }
>
> -#ifdef CONFIG_PM_SLEEP
> -static void mxs_phy_enable_ldo_in_suspend(struct mxs_phy *mxs_phy, bool on)
> +static void __maybe_unused
> +mxs_phy_enable_ldo_in_suspend(struct mxs_phy *mxs_phy, bool on)
>  {
>  	unsigned int reg = on ? ANADIG_ANA_MISC0_SET : ANADIG_ANA_MISC0_CLR;
>
> @@ -832,7 +832,7 @@ static void mxs_phy_enable_ldo_in_suspend(struct mxs_phy *mxs_phy, bool on)
>  		reg, BM_ANADIG_ANA_MISC0_STOP_MODE_CONFIG_SL);
>  }
>
> -static int mxs_phy_system_suspend(struct device *dev)
> +static int __maybe_unused mxs_phy_system_suspend(struct device *dev)
>  {
>  	struct mxs_phy *mxs_phy = dev_get_drvdata(dev);
>
> @@ -842,7 +842,7 @@ static int mxs_phy_system_suspend(struct device *dev)
>  	return 0;
>  }
>
> -static int mxs_phy_system_resume(struct device *dev)
> +static int __maybe_unused mxs_phy_system_resume(struct device *dev)
>  {
>  	struct mxs_phy *mxs_phy = dev_get_drvdata(dev);
>
> @@ -851,7 +851,6 @@ static int mxs_phy_system_resume(struct device *dev)
>
>  	return 0;
>  }
> -#endif /* CONFIG_PM_SLEEP */
>
>  static SIMPLE_DEV_PM_OPS(mxs_phy_pm, mxs_phy_system_suspend,
>  		mxs_phy_system_resume);
> @@ -862,7 +861,7 @@ static struct platform_driver mxs_phy_driver = {
>  	.driver = {
>  		.name = DRIVER_NAME,
>  		.of_match_table = mxs_phy_dt_ids,
> -		.pm = &mxs_phy_pm,
> +		.pm = pm_ptr(&mxs_phy_pm),
>  	},
>  };
>
> --

Acked-by: Peter Chen

-- 

Thanks,
Peter Chen
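The pattern discussed in the patch above (callbacks always compiled, pointer dropped when power management is disabled) can be shown in a small userspace sketch. DEMO_PM is a hypothetical stand-in for the kernel config symbol, demo_pm_ptr() is not the kernel's pm_ptr(), and __attribute__((unused)) plays the role of __maybe_unused (GCC/Clang only):

```c
#include <stddef.h>

/* Hypothetical config switch; build with -DDEMO_PM=0 to drop the ops. */
#ifndef DEMO_PM
#define DEMO_PM 1
#endif

#if DEMO_PM
#define demo_pm_ptr(ptr) (ptr)
#else
#define demo_pm_ptr(ptr) NULL
#endif

struct demo_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
};

struct demo_driver {
	const char *name;
	const struct demo_pm_ops *pm;
};

/* Always compiled, so build errors show up in every configuration. */
static int __attribute__((unused)) demo_suspend(void)
{
	return 0;
}

static int __attribute__((unused)) demo_resume(void)
{
	return 0;
}

static const struct demo_pm_ops __attribute__((unused)) demo_ops = {
	.suspend = demo_suspend,
	.resume = demo_resume,
};

/* With DEMO_PM=0 the pointer becomes NULL and demo_ops is unreferenced. */
static struct demo_driver demo_drv = {
	.name = "demo",
	.pm = demo_pm_ptr(&demo_ops),
};
```

With DEMO_PM set to 0 nothing references demo_ops or the two callbacks, so the compiler is free to discard them, while they are still parsed and type-checked.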
Re: [PATCH v1 02/12] fpga: create intel max10 bmc security engine
On 9/4/20 5:01 PM, Randy Dunlap wrote:
> On 9/4/20 4:52 PM, Russ Weight wrote:
>> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
>> index 97c0a6cc2ba7..0f0bed68e618 100644
>> --- a/drivers/fpga/Kconfig
>> +++ b/drivers/fpga/Kconfig
>> @@ -244,4 +244,15 @@ config IFPGA_SEC_MGR
>> region and for the BMC. Select this option to enable
>> updates for secure FPGA devices.
>>
>> +config IFPGA_M10_BMC_SECURE
>> +tristate "Intel MAX10 BMC security engine"
>> + depends on MFD_INTEL_M10_BMC && IFPGA_SEC_MGR
>> +help
>> + Secure update support for the Intel MAX10 board management
>> + controller.
>
> Please consistently use one tab to indent Kconfig keywords (tristate,
> depends, help) and one tab + 2 spaces to indent help text.
> (as in Documentation/process/coding-style.rst)

Thanks for the feedback. I'll fix these.

>> +
>> + This is a subdriver of the Intel MAX10 board management controller
>> + (BMC) and provides support for secure updates for the BMC image,
>> + the FPGA image, the Root Entry Hashes, etc.
>> +
>> endif # FPGA
>
> thanks.
RE: [RFC v2 02/11] Drivers: hv: vmbus: Move __vmbus_open()
From: Boqun Feng Sent: Tuesday, September 1, 2020 8:01 PM
>
> Pure function movement, no functional changes. The move is made, because
> in a later change, __vmbus_open() will rely on some static functions
> afterwards, so we sperate the move and the modification of

s/sperate/separate/

> __vmbus_open() in two patches to make it easy to review.
>
> Signed-off-by: Boqun Feng
> Reviewed-by: Wei Liu
> ---
>  drivers/hv/channel.c | 309 ++-
>  1 file changed, 155 insertions(+), 154 deletions(-)
>
Re: [PATCH 06/20] usb/chipidea: core: Use pm_ptr() macro
On 20-09-03 13:25:40, Paul Cercueil wrote: > Use the newly introduced pm_ptr() macro, and mark the suspend/resume > functions __maybe_unused. These functions can then be moved outside the > CONFIG_PM_SUSPEND block, and the compiler can then process them and > detect build failures independently of the config. If unused, they will > simply be discarded by the compiler. For using __maybe_unused or using MACRO, it depends. The chipidea core has many functions are only used for power management, you need to add __maybe_unused for everyone of them, I still prefer using MACRO. Peter > > Signed-off-by: Paul Cercueil > --- > drivers/usb/chipidea/core.c | 26 +++--- > 1 file changed, 11 insertions(+), 15 deletions(-) > > diff --git a/drivers/usb/chipidea/core.c b/drivers/usb/chipidea/core.c > index aa40e510b806..af64ab98fb56 100644 > --- a/drivers/usb/chipidea/core.c > +++ b/drivers/usb/chipidea/core.c > @@ -1231,9 +1231,8 @@ static int ci_hdrc_remove(struct platform_device *pdev) > return 0; > } > > -#ifdef CONFIG_PM > /* Prepare wakeup by SRP before suspend */ > -static void ci_otg_fsm_suspend_for_srp(struct ci_hdrc *ci) > +static void __maybe_unused ci_otg_fsm_suspend_for_srp(struct ci_hdrc *ci) > { > if ((ci->fsm.otg->state == OTG_STATE_A_IDLE) && > !hw_read_otgsc(ci, OTGSC_ID)) { > @@ -1245,7 +1244,7 @@ static void ci_otg_fsm_suspend_for_srp(struct ci_hdrc > *ci) > } > > /* Handle SRP when wakeup by data pulse */ > -static void ci_otg_fsm_wakeup_by_srp(struct ci_hdrc *ci) > +static void __maybe_unused ci_otg_fsm_wakeup_by_srp(struct ci_hdrc *ci) > { > if ((ci->fsm.otg->state == OTG_STATE_A_IDLE) && > (ci->fsm.a_bus_drop == 1) && (ci->fsm.a_bus_req == 0)) { > @@ -1259,7 +1258,7 @@ static void ci_otg_fsm_wakeup_by_srp(struct ci_hdrc *ci) > } > } > > -static void ci_controller_suspend(struct ci_hdrc *ci) > +static void __maybe_unused ci_controller_suspend(struct ci_hdrc *ci) > { > disable_irq(ci->irq); > ci_hdrc_enter_lpm(ci, true); > @@ -1277,7 +1276,7 @@ static void 
ci_controller_suspend(struct ci_hdrc *ci) > * interrupt (wakeup int) only let the controller be out of > * low power mode, but not handle any interrupts. > */ > -static void ci_extcon_wakeup_int(struct ci_hdrc *ci) > +static void __maybe_unused ci_extcon_wakeup_int(struct ci_hdrc *ci) > { > struct ci_hdrc_cable *cable_id, *cable_vbus; > u32 otgsc = hw_read_otgsc(ci, ~0); > @@ -1294,7 +1293,7 @@ static void ci_extcon_wakeup_int(struct ci_hdrc *ci) > ci_irq(ci->irq, ci); > } > > -static int ci_controller_resume(struct device *dev) > +static int __maybe_unused ci_controller_resume(struct device *dev) > { > struct ci_hdrc *ci = dev_get_drvdata(dev); > int ret; > @@ -1332,8 +1331,7 @@ static int ci_controller_resume(struct device *dev) > return 0; > } > > -#ifdef CONFIG_PM_SLEEP > -static int ci_suspend(struct device *dev) > +static int __maybe_unused ci_suspend(struct device *dev) > { > struct ci_hdrc *ci = dev_get_drvdata(dev); > > @@ -1366,7 +1364,7 @@ static int ci_suspend(struct device *dev) > return 0; > } > > -static int ci_resume(struct device *dev) > +static int __maybe_unused ci_resume(struct device *dev) > { > struct ci_hdrc *ci = dev_get_drvdata(dev); > int ret; > @@ -1386,9 +1384,8 @@ static int ci_resume(struct device *dev) > > return ret; > } > -#endif /* CONFIG_PM_SLEEP */ > > -static int ci_runtime_suspend(struct device *dev) > +static int __maybe_unused ci_runtime_suspend(struct device *dev) > { > struct ci_hdrc *ci = dev_get_drvdata(dev); > > @@ -1408,13 +1405,12 @@ static int ci_runtime_suspend(struct device *dev) > return 0; > } > > -static int ci_runtime_resume(struct device *dev) > +static int __maybe_unused ci_runtime_resume(struct device *dev) > { > return ci_controller_resume(dev); > } > > -#endif /* CONFIG_PM */ > -static const struct dev_pm_ops ci_pm_ops = { > +static const struct dev_pm_ops __maybe_unused ci_pm_ops = { > SET_SYSTEM_SLEEP_PM_OPS(ci_suspend, ci_resume) > SET_RUNTIME_PM_OPS(ci_runtime_suspend, ci_runtime_resume, NULL) > }; > @@ 
-1424,7 +1420,7 @@ static struct platform_driver ci_hdrc_driver = { > .remove = ci_hdrc_remove, > .driver = { > .name = "ci_hdrc", > - .pm = _pm_ops, > + .pm = pm_ptr(_pm_ops), > .dev_groups = ci_groups, > }, > }; > -- > 2.28.0 > -- Thanks, Peter Chen
Re: [PATCH v1 02/12] fpga: create intel max10 bmc security engine
On 9/4/20 4:52 PM, Russ Weight wrote:
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index 97c0a6cc2ba7..0f0bed68e618 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -244,4 +244,15 @@ config IFPGA_SEC_MGR
> region and for the BMC. Select this option to enable
> updates for secure FPGA devices.
>
> +config IFPGA_M10_BMC_SECURE
> +tristate "Intel MAX10 BMC security engine"
> + depends on MFD_INTEL_M10_BMC && IFPGA_SEC_MGR
> +help
> + Secure update support for the Intel MAX10 board management
> + controller.

Please consistently use one tab to indent Kconfig keywords (tristate,
depends, help) and one tab + 2 spaces to indent help text.
(as in Documentation/process/coding-style.rst)

> +
> + This is a subdriver of the Intel MAX10 board management controller
> + (BMC) and provides support for secure updates for the BMC image,
> + the FPGA image, the Root Entry Hashes, etc.
> +
> endif # FPGA

thanks.
-- 
~Randy
Re: [PATCH v1 01/12] fpga: fpga security manager class driver
On 9/4/20 4:52 PM, Russ Weight wrote:
> diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig
> index 88f64fbf55e3..97c0a6cc2ba7 100644
> --- a/drivers/fpga/Kconfig
> +++ b/drivers/fpga/Kconfig
> @@ -235,4 +235,13 @@ config FPGA_MGR_ZYNQMP_FPGA
> to configure the programmable logic(PL) through PS
> on ZynqMP SoC.
>
> +config IFPGA_SEC_MGR
> + tristate "Intel Security Manager for FPGA"
> +help

Use one tab instead of spaces to indent "help".

> + The Intel Security Manager class driver presents a common
> + user API for managing secure updates for Intel FPGA
> + devices, including flash images for the FPGA static
> + region and for the BMC. Select this option to enable
> + updates for secure FPGA devices.
> +
> endif # FPGA

thanks.
-- 
~Randy
[PATCH v1 03/12] fpga: expose max10 flash update counts in sysfs
Extend the MAX10 BMC Security Engine driver to provide a
handler to expose the flash update count for the FPGA user
image.

Signed-off-by: Russ Weight
Reviewed-by: Wu Hao
---
 drivers/fpga/intel-m10-bmc-secure.c | 32 +
 1 file changed, 32 insertions(+)

diff --git a/drivers/fpga/intel-m10-bmc-secure.c b/drivers/fpga/intel-m10-bmc-secure.c
index 1f86bfb694b4..b824790e43aa 100644
--- a/drivers/fpga/intel-m10-bmc-secure.c
+++ b/drivers/fpga/intel-m10-bmc-secure.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 
 struct m10bmc_sec {
@@ -99,7 +100,38 @@ SYSFS_GET_REH(bmc, BMC_REH_ADDR)
 SYSFS_GET_REH(sr, SR_REH_ADDR)
 SYSFS_GET_REH(pr, PR_REH_ADDR)
 
+#define FLASH_COUNT_SIZE 4096
+#define USER_FLASH_COUNT 0x17ffb000
+
+static int get_qspi_flash_count(struct ifpga_sec_mgr *imgr)
+{
+	struct m10bmc_sec *sec = imgr->priv;
+	unsigned int stride = regmap_get_reg_stride(sec->m10bmc->regmap);
+	unsigned int cnt, num_bits = FLASH_COUNT_SIZE * 8;
+	u8 *flash_buf;
+	int ret;
+
+	flash_buf = kmalloc(FLASH_COUNT_SIZE, GFP_KERNEL);
+	if (!flash_buf)
+		return -ENOMEM;
+
+	ret = m10bmc_raw_bulk_read(sec->m10bmc, USER_FLASH_COUNT, flash_buf,
+				   FLASH_COUNT_SIZE / stride);
+	if (ret) {
+		dev_err(sec->dev, "%s failed to read %d\n", __func__, ret);
+		goto exit_free;
+	}
+
+	cnt = num_bits - bitmap_weight((unsigned long *)flash_buf, num_bits);
+
+exit_free:
+	kfree(flash_buf);
+
+	return ret ? : cnt;
+}
+
 static const struct ifpga_sec_mgr_ops m10bmc_iops = {
+	.user_flash_count = get_qspi_flash_count,
 	.bmc_root_entry_hash = get_bmc_root_entry_hash,
 	.sr_root_entry_hash = get_sr_root_entry_hash,
 	.pr_root_entry_hash = get_pr_root_entry_hash,
-- 
2.17.1
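The count in the patch above is computed as the total number of bits minus the number of set bits, i.e. the number of cleared bits in the flash region. My reading (an assumption, not stated in the patch) is that erased flash reads as all ones and each update clears one more bit. A minimal userspace sketch of the same arithmetic, with count_cleared_bits() as an illustrative name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of zero bits in buf[0..len), computed the same way
 * as the patch: total bits minus the population count.
 */
static unsigned int count_cleared_bits(const uint8_t *buf, size_t len)
{
	unsigned int set = 0;

	assert(buf != NULL);

	for (size_t i = 0; i < len; i++) {
		uint8_t b = buf[i];

		/* Clear the lowest set bit until none remain. */
		while (b) {
			b &= (uint8_t)(b - 1);
			set++;
		}
	}
	return (unsigned int)(len * 8) - set;
}
```

A buffer of all 0xff bytes gives a count of 0; changing the first byte to 0xf8 gives 3.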
[PATCH v1 11/12] fpga: expose hardware error info in sysfs
Extend the Intel Security Manager class driver to include an optional update/hw_errinfo sysfs node that can be used to retrieve 64 bits of device specific error information following a secure update failure. The underlying driver must provide a get_hw_errinfo() callback function to enable this feature. This data is treated as opaque by the class driver. It is left to user-space software or support personnel to interpret this data. Signed-off-by: Russ Weight Reviewed-by: Wu Hao --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 14 +++ drivers/fpga/ifpga-sec-mgr.c | 38 +++ include/linux/fpga/ifpga-sec-mgr.h| 5 +++ 3 files changed, 57 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr index 762a7dee9453..20bde1abb5e4 100644 --- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -135,3 +135,17 @@ Description: Read-only. Returns a string describing the failure idle state. If this file is read while a secure update is in progress, then the read will fail with EBUSY. + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/update/hw_errinfo +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read-only. Returns a 64 bit error value providing + hardware specific information that may be useful in + debugging errors that occur during FPGA image updates. + This file is only visible if the underlying device + supports it. The hw_errinfo value is only accessible + when the secure update engine is in the idle state. + If this file is read while a secure update is in + progress, then the read will fail with EBUSY. + Format: "0x%llx". 
diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c index afd97c135ebe..6944396eff80 100644 --- a/drivers/fpga/ifpga-sec-mgr.c +++ b/drivers/fpga/ifpga-sec-mgr.c @@ -152,10 +152,17 @@ static void set_error(struct ifpga_sec_mgr *imgr, enum ifpga_sec_err err_code) imgr->err_code = err_code; } +static void set_hw_errinfo(struct ifpga_sec_mgr *imgr) +{ + if (imgr->iops->get_hw_errinfo) + imgr->hw_errinfo = imgr->iops->get_hw_errinfo(imgr); +} + static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr, enum ifpga_sec_err err_code) { set_error(imgr, err_code); + set_hw_errinfo(imgr); imgr->iops->cancel(imgr); } @@ -348,6 +355,23 @@ error_show(struct device *dev, struct device_attribute *attr, char *buf) } static DEVICE_ATTR_RO(error); +static ssize_t +hw_errinfo_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct ifpga_sec_mgr *imgr = to_sec_mgr(dev); + int ret; + + mutex_lock(>lock); + if (imgr->progress != IFPGA_SEC_PROG_IDLE) + ret = -EBUSY; + else + ret = sprintf(buf, "0x%llx\n", imgr->hw_errinfo); + mutex_unlock(>lock); + + return ret; +} +static DEVICE_ATTR_RO(hw_errinfo); + static ssize_t remaining_size_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -382,6 +406,7 @@ static ssize_t filename_store(struct device *dev, struct device_attribute *attr, imgr->filename[strlen(imgr->filename) - 1] = '\0'; imgr->err_code = IFPGA_SEC_ERR_NONE; + imgr->hw_errinfo = 0; imgr->request_cancel = false; imgr->progress = IFPGA_SEC_PROG_READ_FILE; reinit_completion(>update_done); @@ -416,18 +441,31 @@ static ssize_t cancel_store(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR_WO(cancel); +static umode_t +sec_mgr_update_visible(struct kobject *kobj, struct attribute *attr, int n) +{ + struct ifpga_sec_mgr *imgr = to_sec_mgr(kobj_to_dev(kobj)); + + if (attr == _attr_hw_errinfo.attr && !imgr->iops->get_hw_errinfo) + return 0; + + return attr->mode; +} + static struct attribute 
*sec_mgr_update_attrs[] = { _attr_filename.attr, _attr_cancel.attr, _attr_status.attr, _attr_error.attr, _attr_remaining_size.attr, + _attr_hw_errinfo.attr, NULL, }; static struct attribute_group sec_mgr_update_attr_group = { .name = "update", .attrs = sec_mgr_update_attrs, + .is_visible = sec_mgr_update_visible, }; static ssize_t name_show(struct device *dev, diff --git a/include/linux/fpga/ifpga-sec-mgr.h b/include/linux/fpga/ifpga-sec-mgr.h index f51ed663a723..3be8d8da078a 100644 --- a/include/linux/fpga/ifpga-sec-mgr.h +++ b/include/linux/fpga/ifpga-sec-mgr.h @@ -135,6 +135,9 @@ enum ifpga_sec_err { * function and is called at the
[PATCH v1 12/12] fpga: add max10 get_hw_errinfo callback func
Extend the MAX10 BMC Security Engine driver to include
a function that returns 64 bits of additional HW specific
data for errors that require additional information.
This callback function enables the hw_errinfo sysfs
node in the Intel Security Manager class driver.

Signed-off-by: Russ Weight
Reviewed-by: Wu Hao
---
 drivers/fpga/intel-m10-bmc-secure.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/fpga/intel-m10-bmc-secure.c b/drivers/fpga/intel-m10-bmc-secure.c
index 4a66c2d448eb..7fb1c805f654 100644
--- a/drivers/fpga/intel-m10-bmc-secure.c
+++ b/drivers/fpga/intel-m10-bmc-secure.c
@@ -450,6 +450,30 @@ static enum ifpga_sec_err m10bmc_sec_cancel(struct ifpga_sec_mgr *imgr)
 	return ret ? IFPGA_SEC_ERR_RW_ERROR : IFPGA_SEC_ERR_NONE;
 }
 
+static u64 m10bmc_sec_hw_errinfo(struct ifpga_sec_mgr *imgr)
+{
+	struct m10bmc_sec *sec = imgr->priv;
+	u32 doorbell = 0, auth_result = 0;
+	u64 hw_errinfo = 0;
+
+	switch (imgr->err_code) {
+	case IFPGA_SEC_ERR_HW_ERROR:
+	case IFPGA_SEC_ERR_TIMEOUT:
+	case IFPGA_SEC_ERR_BUSY:
+	case IFPGA_SEC_ERR_WEAROUT:
+		if (!m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, &doorbell))
+			hw_errinfo = (u64)doorbell << 32;
+
+		if (!m10bmc_sys_read(sec->m10bmc, M10BMC_AUTH_RESULT,
+				     &auth_result))
+			hw_errinfo |= (u64)auth_result;
+
+		return hw_errinfo;
+	default:
+		return 0;
+	}
+}
+
 static const struct ifpga_sec_mgr_ops m10bmc_iops = {
 	.user_flash_count = get_qspi_flash_count,
 	.bmc_root_entry_hash = get_bmc_root_entry_hash,
@@ -467,7 +491,8 @@ static const struct ifpga_sec_mgr_ops m10bmc_iops = {
 	.prepare = m10bmc_sec_prepare,
 	.write_blk = m10bmc_sec_write_blk,
 	.poll_complete = m10bmc_sec_poll_complete,
-	.cancel = m10bmc_sec_cancel
+	.cancel = m10bmc_sec_cancel,
+	.get_hw_errinfo = m10bmc_sec_hw_errinfo
 };
 
 static void ifpga_sec_mgr_uinit(struct m10bmc_sec *sec)
-- 
2.17.1
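Since the class driver treats hw_errinfo as opaque, user space has to split the 64-bit value itself. The patch above places the doorbell register in the upper 32 bits and the auth result in the lower 32 bits. A minimal sketch of packing and unpacking that layout, with helper names that are illustrative only:

```c
#include <stdint.h>

/* Pack the two 32-bit registers the way the callback above does. */
static uint64_t pack_errinfo(uint32_t doorbell, uint32_t auth_result)
{
	return ((uint64_t)doorbell << 32) | auth_result;
}

/* Upper 32 bits: doorbell register value. */
static uint32_t errinfo_doorbell(uint64_t hw_errinfo)
{
	return (uint32_t)(hw_errinfo >> 32);
}

/* Lower 32 bits: auth result register value. */
static uint32_t errinfo_auth_result(uint64_t hw_errinfo)
{
	return (uint32_t)hw_errinfo;
}
```

A tool reading the sysfs node would parse the "0x%llx" string into a 64-bit integer first and then apply the two accessors.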
[PATCH v1 09/12] fpga: expose sec-mgr update size
Extend the Intel Security Manager class driver to include
an update/remaining_size sysfs node that can be read to
determine how much data remains to be transferred to the
secure update engine. This file can be used to monitor
progress during the "writing" phase of an update.

Signed-off-by: Russ Weight
Reviewed-by: Wu Hao
---
 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr | 11 +++
 drivers/fpga/ifpga-sec-mgr.c                        | 10 ++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
index e7b1b02bf7ee..cf1967f1b3e3 100644
--- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
+++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
@@ -98,6 +98,17 @@ Description: Read-only. Returns a string describing the current
 		as it will be signaled by sysfs_notify() on each
 		state change.
 
+What:		/sys/class/ifpga_sec_mgr/ifpga_secX/update/remaining_size
+Date:		Sep 2020
+KernelVersion:	5.10
+Contact:	Russ Weight
+Description:	Read-only. Returns the size of data that remains to
+		be written to the secure update engine. The size
+		value is initialized to the full size of the file
+		image and the value is updated periodically during
+		the "writing" phase of the update.
+		Format: "%u".
+
 What:		/sys/class/ifpga_sec_mgr/ifpga_secX/update/error
 Date:		Sep 2020
 KernelVersion:	5.10
diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c
index a7718bd8ee61..4ca5d13e5656 100644
--- a/drivers/fpga/ifpga-sec-mgr.c
+++ b/drivers/fpga/ifpga-sec-mgr.c
@@ -325,6 +325,15 @@ error_show(struct device *dev, struct device_attribute *attr, char *buf)
 }
 static DEVICE_ATTR_RO(error);
 
+static ssize_t remaining_size_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct ifpga_sec_mgr *imgr = to_sec_mgr(dev);
+
+	return sprintf(buf, "%u\n", imgr->remaining_size);
+}
+static DEVICE_ATTR_RO(remaining_size);
+
 static ssize_t filename_store(struct device *dev, struct device_attribute *attr,
 			      const char *buf, size_t count)
 {
@@ -364,6 +373,7 @@ static struct attribute *sec_mgr_update_attrs[] = {
 	&dev_attr_filename.attr,
 	&dev_attr_status.attr,
 	&dev_attr_error.attr,
+	&dev_attr_remaining_size.attr,
 	NULL,
 };
 
-- 
2.17.1
[PATCH v1 01/12] fpga: fpga security manager class driver
Create the Intel Security Manager class driver. The security manager provides interfaces to manage secure updates for the FPGA and BMC images that are stored in FLASH. The driver can also be used to update root entry hashes and to cancel code signing keys. This patch creates the class driver and provides sysfs interfaces for displaying root entry hashes, canceled code signing keys and flash counts. Signed-off-by: Russ Weight Signed-off-by: Xu Yilun --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 75 MAINTAINERS | 8 + drivers/fpga/Kconfig | 9 + drivers/fpga/Makefile | 3 + drivers/fpga/ifpga-sec-mgr.c | 339 ++ include/linux/fpga/ifpga-sec-mgr.h| 145 6 files changed, 579 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr create mode 100644 drivers/fpga/ifpga-sec-mgr.c create mode 100644 include/linux/fpga/ifpga-sec-mgr.h diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr new file mode 100644 index ..86f8992559bf --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -0,0 +1,75 @@ +What: /sys/class/ifpga_sec_mgr/ifpga_secX/name +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Name of low level fpga security manager driver. + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the static + region if one is programmed, else it returns the + string: "hash not programmed". This file is only + visible if the underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the partial + reconfiguration region if one is programmed, else it + returns the string: "hash not programmed". 
This file + is only visible if the underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_root_entry_hash +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns the root entry hash for the BMC image + if one is programmed, else it returns the string: + "hash not programmed". This file is only visible if the + underlying device supports it. + Format: "0x%x". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/sr_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the static region. The standard bitmap + list format is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/pr_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the partial reconfiguration region. The + standard bitmap list format is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_canceled_csks +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns a list of indices for canceled code + signing keys for the BMC. The standard bitmap list format + is used (e.g. "1,2-6,9"). + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/user_flash_count +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns number of times the user image for the + static region has been flashed. + Format: "%d". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/security/bmc_flash_count +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read only. Returns number of times the BMC image has been + flashed. + Format: "%d". 
diff --git a/MAINTAINERS b/MAINTAINERS index deaafb617361..4a2ebe6b120d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6830,6 +6830,14 @@ F: Documentation/fpga/ F: drivers/fpga/ F: include/linux/fpga/ +INTEL FPGA SECURITY MANAGER DRIVERS +M: Russ Weight +L: linux-f...@vger.kernel.org +S: Maintained +F: Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr +F: drivers/fpga/ifpga-sec-mgr.c +F: include/linux/fpga/ifpga-sec-mgr.h +
[PATCH v1 04/12] fpga: expose max10 canceled keys in sysfs
Extend the MAX10 BMC Security Engine driver to provide a
handler to expose the canceled code signing key (CSK) bit
vectors. These use the standard bitmap list format
(e.g. 1,2-6,9).

Signed-off-by: Russ Weight
Reviewed-by: Wu Hao
---
 drivers/fpga/intel-m10-bmc-secure.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/drivers/fpga/intel-m10-bmc-secure.c b/drivers/fpga/intel-m10-bmc-secure.c
index b824790e43aa..46cd49a08be0 100644
--- a/drivers/fpga/intel-m10-bmc-secure.c
+++ b/drivers/fpga/intel-m10-bmc-secure.c
@@ -130,14 +130,74 @@ static int get_qspi_flash_count(struct ifpga_sec_mgr *imgr)
 	return ret ? : cnt;
 }
 
+#define CSK_BIT_LEN			128U
+#define CSK_32ARRAY_SIZE(_nbits)	DIV_ROUND_UP(_nbits, 32)
+
+#define SYSFS_GET_CSK_CANCEL_NBITS(_name) \
+static int get_##_name##_csk_cancel_nbits(struct ifpga_sec_mgr *imgr) \
+{ \
+	return (int)CSK_BIT_LEN; \
+}
+
+SYSFS_GET_CSK_CANCEL_NBITS(bmc)
+SYSFS_GET_CSK_CANCEL_NBITS(sr)
+SYSFS_GET_CSK_CANCEL_NBITS(pr)
+
+static int get_csk_vector(struct ifpga_sec_mgr *imgr, u32 addr,
+			  unsigned long *csk_map, unsigned int nbits)
+{
+	unsigned int i, arr_size = CSK_32ARRAY_SIZE(nbits);
+	struct m10bmc_sec *sec = imgr->priv;
+	u32 *csk32;
+	int ret;
+
+	csk32 = vmalloc(arr_size * sizeof(u32));
+	if (!csk32)
+		return -ENOMEM;
+
+	ret = m10bmc_raw_bulk_read(sec->m10bmc, addr, csk32, arr_size);
+	if (ret) {
+		dev_err(sec->dev, "%s failed to read %d\n", __func__, ret);
+		goto vfree_exit;
+	}
+
+	for (i = 0; i < arr_size; i++)
+		csk32[i] = le32_to_cpu(csk32[i]);
+
+	bitmap_from_arr32(csk_map, csk32, nbits);
+	bitmap_complement(csk_map, csk_map, nbits);
+
+vfree_exit:
+	vfree(csk32);
+	return ret;
+}
+
+#define SYSFS_GET_CSK_VEC(_name, _addr) \
+static int get_##_name##_canceled_csks(struct ifpga_sec_mgr *imgr, \
+				       unsigned long *csk_map, \
+				       unsigned int nbits) \
+{ return get_csk_vector(imgr, _addr, csk_map, nbits); }
+
+#define CSK_VEC_OFFSET 0x34
+
+SYSFS_GET_CSK_VEC(bmc, BMC_PROG_ADDR + CSK_VEC_OFFSET)
+SYSFS_GET_CSK_VEC(sr, SR_PROG_ADDR + CSK_VEC_OFFSET)
+SYSFS_GET_CSK_VEC(pr, PR_PROG_ADDR + CSK_VEC_OFFSET)
+
 static const struct ifpga_sec_mgr_ops m10bmc_iops = {
 	.user_flash_count = get_qspi_flash_count,
 	.bmc_root_entry_hash = get_bmc_root_entry_hash,
 	.sr_root_entry_hash = get_sr_root_entry_hash,
 	.pr_root_entry_hash = get_pr_root_entry_hash,
+	.bmc_canceled_csks = get_bmc_canceled_csks,
+	.sr_canceled_csks = get_sr_canceled_csks,
+	.pr_canceled_csks = get_pr_canceled_csks,
 	.bmc_reh_size = get_bmc_reh_size,
 	.sr_reh_size = get_sr_reh_size,
 	.pr_reh_size = get_pr_reh_size,
+	.bmc_canceled_csk_nbits = get_bmc_csk_cancel_nbits,
+	.sr_canceled_csk_nbits = get_sr_csk_cancel_nbits,
+	.pr_canceled_csk_nbits = get_pr_csk_cancel_nbits
 };
 
 static void ifpga_sec_mgr_uinit(struct m10bmc_sec *sec)
-- 
2.17.1
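To illustrate what the sysfs output looks like, here is a small user-space sketch of the same idea: take the raw vector, treat cleared bits as canceled keys (my reading of the bitmap_complement() step in get_csk_vector()), and print the indices in bitmap list form. This is not the kernel's formatter; format_canceled_csks() and test_bit32() are made-up names, and the input words are assumed to be in CPU byte order already.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return bit number 'bit' of a vector stored as 32-bit words. */
static int test_bit32(const uint32_t *words, unsigned int bit)
{
	return (int)((words[bit / 32] >> (bit % 32)) & 1U);
}

/*
 * Format the canceled-key indices as a list such as "1,3-6,9".
 * A key is treated as canceled when its bit in the raw vector is 0.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_canceled_csks(const uint32_t *raw, unsigned int nbits,
				char *buf, size_t len)
{
	size_t used = 0;
	unsigned int i = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	while (i < nbits) {
		unsigned int start, end;
		int n;

		if (test_bit32(raw, i)) {	/* bit set: key not canceled */
			i++;
			continue;
		}

		/* Collect a run of consecutive canceled keys. */
		start = i;
		while (i < nbits && !test_bit32(raw, i))
			i++;
		end = i - 1;

		if (start == end)
			n = snprintf(buf + used, len - used, "%s%u",
				     used ? "," : "", start);
		else
			n = snprintf(buf + used, len - used, "%s%u-%u",
				     used ? "," : "", start, end);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}
```

Note that a formatter of this kind merges adjacent indices into one range, so keys 1 through 6 would print as "1-6".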
[PATCH v1 10/12] fpga: enable sec-mgr update cancel
Extend the Intel Security Manager class driver to include an update/cancel sysfs file that can be written to request that an update be canceled. The write may return EBUSY if the update has progressed to the point that it cannot be canceled by software or ENODEV if there is no update in progress. Signed-off-by: Russ Weight --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 10 drivers/fpga/ifpga-sec-mgr.c | 59 +-- include/linux/fpga/ifpga-sec-mgr.h| 1 + 3 files changed, 66 insertions(+), 4 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr index cf1967f1b3e3..762a7dee9453 100644 --- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -87,6 +87,16 @@ Description: Write only. Write the filename of an Intel image and Root Entry Hashes, and to cancel Code Signing Keys (CSK). +What: /sys/class/ifpga_sec_mgr/ifpga_secX/update/cancel +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Write-only. Write a "1" to this file to request + that a current update be canceled. This request + will be rejected (EBUSY) if the programming phase + has already started or (ENODEV) if there is no + update in progress. 
+ What: /sys/class/ifpga_sec_mgr/ifpga_secX/update/status Date: Sep 2020 KernelVersion: 5.10 diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c index 4ca5d13e5656..afd97c135ebe 100644 --- a/drivers/fpga/ifpga-sec-mgr.c +++ b/drivers/fpga/ifpga-sec-mgr.c @@ -159,6 +159,23 @@ static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr, imgr->iops->cancel(imgr); } +static int progress_transition(struct ifpga_sec_mgr *imgr, + enum ifpga_sec_prog new_progress) +{ + int ret = 0; + + mutex_lock(>lock); + if (imgr->request_cancel) { + set_error(imgr, IFPGA_SEC_ERR_CANCELED); + imgr->iops->cancel(imgr); + ret = -ECANCELED; + } else { + update_progress(imgr, new_progress); + } + mutex_unlock(>lock); + return ret; +} + static void progress_complete(struct ifpga_sec_mgr *imgr) { mutex_lock(>lock); @@ -190,16 +207,20 @@ static void ifpga_sec_mgr_update(struct work_struct *work) goto release_fw_exit; } - update_progress(imgr, IFPGA_SEC_PROG_PREPARING); + if (progress_transition(imgr, IFPGA_SEC_PROG_PREPARING)) + goto modput_exit; + ret = imgr->iops->prepare(imgr); if (ret) { ifpga_sec_dev_error(imgr, ret); goto modput_exit; } - update_progress(imgr, IFPGA_SEC_PROG_WRITING); + if (progress_transition(imgr, IFPGA_SEC_PROG_WRITING)) + goto done; + size = imgr->remaining_size; - while (size) { + while (size && !imgr->request_cancel) { blk_size = min_t(u32, size, WRITE_BLOCK_SIZE); size -= blk_size; ret = imgr->iops->write_blk(imgr, offset, blk_size); @@ -212,7 +233,9 @@ static void ifpga_sec_mgr_update(struct work_struct *work) offset += blk_size; } - update_progress(imgr, IFPGA_SEC_PROG_PROGRAMMING); + if (progress_transition(imgr, IFPGA_SEC_PROG_PROGRAMMING)) + goto done; + ret = imgr->iops->poll_complete(imgr); if (ret) { ifpga_sec_dev_error(imgr, ret); @@ -359,6 +382,7 @@ static ssize_t filename_store(struct device *dev, struct device_attribute *attr, imgr->filename[strlen(imgr->filename) - 1] = '\0'; imgr->err_code = IFPGA_SEC_ERR_NONE; + 
imgr->request_cancel = false; imgr->progress = IFPGA_SEC_PROG_READ_FILE; reinit_completion(>update_done); schedule_work(>work); @@ -369,8 +393,32 @@ static ssize_t filename_store(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR_WO(filename); +static ssize_t cancel_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ifpga_sec_mgr *imgr = to_sec_mgr(dev); + bool cancel; + int ret = 0; + + if (kstrtobool(buf, ) || !cancel) + return -EINVAL; + + mutex_lock(>lock); + if (imgr->progress == IFPGA_SEC_PROG_PROGRAMMING) + ret = -EBUSY; + else if (imgr->progress == IFPGA_SEC_PROG_IDLE) + ret = -ENODEV; + else + imgr->request_cancel = true; + mutex_unlock(>lock); + + return ret ? : count; +} +static DEVICE_ATTR_WO(cancel); + static struct attribute *sec_mgr_update_attrs[] = { _attr_filename.attr, +
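The cancel semantics described in this patch can be modeled in a few lines of plain C. This is only a sketch of the decision logic as I read it from cancel_store() and progress_transition(); the type and function names are invented for the example, and the mutex that the driver holds around these checks is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Same ordering as the status strings: idle ... programming. */
enum sec_prog {
	PROG_IDLE,
	PROG_READ_FILE,
	PROG_PREPARING,
	PROG_WRITING,
	PROG_PROGRAMMING
};

struct update_state {
	enum sec_prog progress;
	bool request_cancel;
};

/* Models a write of "1" to update/cancel. */
static int request_cancel(struct update_state *st)
{
	if (st->progress == PROG_PROGRAMMING)
		return -EBUSY;	/* too late, software cannot stop it */
	if (st->progress == PROG_IDLE)
		return -ENODEV;	/* nothing to cancel */

	st->request_cancel = true;
	return 0;
}

/* Models a phase change: a pending cancel wins over the transition. */
static int advance(struct update_state *st, enum sec_prog next)
{
	if (st->request_cancel)
		return -ECANCELED;

	st->progress = next;
	return 0;
}
```

The point of the model is that a cancel request is only recorded by the writer and acted on later by the worker, at the next phase boundary or write-loop iteration.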
[PATCH v1 05/12] fpga: enable secure updates
Extend the FPGA Intel Security Manager class driver to include an update/filename sysfs node that can be used to initiate a security update. The filename of a secure update file (BMC image, FPGA image, Root Entry Hash image, or Code Signing Key cancellation image) can be written to this sysfs entry to cause a secure update to occur. The write of the filename will return immediately, and the update will begin in the context of a kernel worker thread. This tool utilizes the request_firmware framework, which requires that the image file reside under /lib/firmware. Signed-off-by: Russ Weight --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 13 ++ drivers/fpga/ifpga-sec-mgr.c | 155 ++ include/linux/fpga/ifpga-sec-mgr.h| 49 ++ 3 files changed, 217 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr index 86f8992559bf..a476504b7ae9 100644 --- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -73,3 +73,16 @@ Contact: Russ Weight Description: Read only. Returns number of times the BMC image has been flashed. Format: "%d". + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/update/filename +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Write only. Write the filename of an Intel image + file to this sysfs file to initiate a secure + update. The file must have an appropriate header + which, among other things, identifies the target + for the update. This mechanism is used to update + BMC images, BMC firmware, Static Region images, + and Root Entry Hashes, and to cancel Code Signing + Keys (CSK). diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c index 97bf80277ed2..73173badbe96 100644 --- a/drivers/fpga/ifpga-sec-mgr.c +++ b/drivers/fpga/ifpga-sec-mgr.c @@ -5,8 +5,11 @@ * Copyright (C) 2019-2020 Intel Corporation, Inc. 
*/ +#include +#include #include #include +#include #include #include #include @@ -14,6 +17,8 @@ static DEFINE_IDA(ifpga_sec_mgr_ida); static struct class *ifpga_sec_mgr_class; +#define WRITE_BLOCK_SIZE 0x4000 + static ssize_t show_canceled_csk(struct ifpga_sec_mgr *imgr, sysfs_csk_hndlr_t get_csk, sysfs_csk_nbits_t get_csk_nbits, @@ -134,6 +139,91 @@ static struct attribute *sec_mgr_security_attrs[] = { NULL, }; +static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr, + enum ifpga_sec_err err_code) +{ + imgr->err_code = err_code; + imgr->iops->cancel(imgr); +} + +static void progress_complete(struct ifpga_sec_mgr *imgr) +{ + mutex_lock(>lock); + imgr->progress = IFPGA_SEC_PROG_IDLE; + complete_all(>update_done); + mutex_unlock(>lock); +} + +static void ifpga_sec_mgr_update(struct work_struct *work) +{ + u32 size, blk_size, offset = 0; + struct ifpga_sec_mgr *imgr; + const struct firmware *fw; + enum ifpga_sec_err ret; + + imgr = container_of(work, struct ifpga_sec_mgr, work); + + get_device(>dev); + if (request_firmware(, imgr->filename, >dev)) { + imgr->err_code = IFPGA_SEC_ERR_FILE_READ; + goto idle_exit; + } + + imgr->data = fw->data; + imgr->remaining_size = fw->size; + + if (!try_module_get(imgr->dev.parent->driver->owner)) { + imgr->err_code = IFPGA_SEC_ERR_BUSY; + goto release_fw_exit; + } + + imgr->progress = IFPGA_SEC_PROG_PREPARING; + ret = imgr->iops->prepare(imgr); + if (ret) { + ifpga_sec_dev_error(imgr, ret); + goto modput_exit; + } + + imgr->progress = IFPGA_SEC_PROG_WRITING; + size = imgr->remaining_size; + while (size) { + blk_size = min_t(u32, size, WRITE_BLOCK_SIZE); + size -= blk_size; + ret = imgr->iops->write_blk(imgr, offset, blk_size); + if (ret) { + ifpga_sec_dev_error(imgr, ret); + goto done; + } + + imgr->remaining_size = size; + offset += blk_size; + } + + imgr->progress = IFPGA_SEC_PROG_PROGRAMMING; + ret = imgr->iops->poll_complete(imgr); + if (ret) { + ifpga_sec_dev_error(imgr, ret); + goto done; + } + +done: + if 
(imgr->iops->cleanup) + imgr->iops->cleanup(imgr); + +modput_exit: + module_put(imgr->dev.parent->driver->owner); + +release_fw_exit: + imgr->data = NULL; + release_firmware(fw); + +idle_exit: + kfree(imgr->filename); + imgr->filename =
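The write loop in ifpga_sec_mgr_update() splits the image into WRITE_BLOCK_SIZE pieces. A stand-alone sketch of that chunking follows, assuming a callback in place of iops->write_blk(). The names write_in_blocks(), write_blk_fn, struct blk_log and log_blk() are hypothetical and exist only for this example.

```c
#include <stdint.h>

#define WRITE_BLOCK_SIZE 0x4000U	/* same value as in the patch */

/* Hypothetical stand-in for iops->write_blk(). */
typedef int (*write_blk_fn)(void *ctx, uint32_t offset, uint32_t size);

/*
 * Hand 'total' bytes to write_blk in blocks of at most WRITE_BLOCK_SIZE.
 * '*remaining' is updated after every successful block, in the same way
 * the driver updates imgr->remaining_size.
 */
static int write_in_blocks(uint32_t total, write_blk_fn write_blk,
			   void *ctx, uint32_t *remaining)
{
	uint32_t offset = 0, size = total;

	*remaining = total;
	while (size) {
		uint32_t blk = size < WRITE_BLOCK_SIZE ? size : WRITE_BLOCK_SIZE;
		int ret = write_blk(ctx, offset, blk);

		if (ret)
			return ret;	/* stop at the first failed block */

		size -= blk;
		offset += blk;
		*remaining = size;
	}

	return 0;
}

/* Example callback that only records what it was asked to write. */
struct blk_log {
	unsigned int calls;
	uint32_t last_offset;
	uint32_t last_size;
};

static int log_blk(void *ctx, uint32_t offset, uint32_t size)
{
	struct blk_log *log = ctx;

	log->calls++;
	log->last_offset = offset;
	log->last_size = size;
	return 0;
}
```

With a 0x9000-byte image this yields two full blocks and one short block of 0x1000 bytes at offset 0x8000.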
[PATCH v1 07/12] fpga: expose sec-mgr update status
Extend the Intel Security Manager class driver to
include an update/status sysfs node that can be polled
and read to monitor the progress of an ongoing secure
update. Sysfs_notify() is used to signal transitions
between different phases of the update process.

Signed-off-by: Russ Weight
Reviewed-by: Wu Hao
---
 .../ABI/testing/sysfs-class-ifpga-sec-mgr | 11 ++
 drivers/fpga/ifpga-sec-mgr.c              | 34 ---
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
index a476504b7ae9..849ccb2802f8 100644
--- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
+++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
@@ -86,3 +86,14 @@ Description:	Write only. Write the filename of an Intel image
 		BMC images, BMC firmware, Static Region images,
 		and Root Entry Hashes, and to cancel Code Signing
 		Keys (CSK).
+
+What: 		/sys/class/ifpga_sec_mgr/ifpga_secX/update/status
+Date:		Sep 2020
+KernelVersion:  5.10
+Contact:	Russ Weight
+Description:	Read-only. Returns a string describing the current
+		status of an update. The string will be one of the
+		following: idle, read_file, preparing, writing,
+		programming. Userspace code can poll on this file,
+		as it will be signaled by sysfs_notify() on each
+		state change.
diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c
index 73173badbe96..5fe3d85e2963 100644
--- a/drivers/fpga/ifpga-sec-mgr.c
+++ b/drivers/fpga/ifpga-sec-mgr.c
@@ -139,6 +139,13 @@ static struct attribute *sec_mgr_security_attrs[] = {
 	NULL,
 };
 
+static void update_progress(struct ifpga_sec_mgr *imgr,
+			    enum ifpga_sec_prog new_progress)
+{
+	imgr->progress = new_progress;
+	sysfs_notify(&imgr->dev.kobj, "update", "status");
+}
+
 static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr,
 				enum ifpga_sec_err err_code)
 {
@@ -149,7 +156,7 @@ static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr,
 static void progress_complete(struct ifpga_sec_mgr *imgr)
 {
 	mutex_lock(&imgr->lock);
-	imgr->progress = IFPGA_SEC_PROG_IDLE;
+	update_progress(imgr, IFPGA_SEC_PROG_IDLE);
 	complete_all(&imgr->update_done);
 	mutex_unlock(&imgr->lock);
 }
@@ -177,14 +184,14 @@ static void ifpga_sec_mgr_update(struct work_struct *work)
 		goto release_fw_exit;
 	}
 
-	imgr->progress = IFPGA_SEC_PROG_PREPARING;
+	update_progress(imgr, IFPGA_SEC_PROG_PREPARING);
 	ret = imgr->iops->prepare(imgr);
 	if (ret) {
 		ifpga_sec_dev_error(imgr, ret);
 		goto modput_exit;
 	}
 
-	imgr->progress = IFPGA_SEC_PROG_WRITING;
+	update_progress(imgr, IFPGA_SEC_PROG_WRITING);
 	size = imgr->remaining_size;
 	while (size) {
 		blk_size = min_t(u32, size, WRITE_BLOCK_SIZE);
@@ -199,7 +206,7 @@ static void ifpga_sec_mgr_update(struct work_struct *work)
 		offset += blk_size;
 	}
 
-	imgr->progress = IFPGA_SEC_PROG_PROGRAMMING;
+	update_progress(imgr, IFPGA_SEC_PROG_PROGRAMMING);
 	ret = imgr->iops->poll_complete(imgr);
 	if (ret) {
 		ifpga_sec_dev_error(imgr, ret);
@@ -251,6 +258,24 @@ static struct attribute_group sec_mgr_security_attr_group = {
 	.is_visible = sec_mgr_visible,
 };
 
+static const char * const sec_mgr_prog_str[] = {
+	"idle",			/* IFPGA_SEC_PROG_IDLE */
+	"read_file",		/* IFPGA_SEC_PROG_READ_FILE */
+	"preparing",		/* IFPGA_SEC_PROG_PREPARING */
+	"writing",		/* IFPGA_SEC_PROG_WRITING */
+	"programming"		/* IFPGA_SEC_PROG_PROGRAMMING */
+};
+
+static ssize_t
+status_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct ifpga_sec_mgr *imgr = to_sec_mgr(dev);
+
+	return sprintf(buf, "%s\n", (imgr->progress < IFPGA_SEC_PROG_MAX) ?
+		       sec_mgr_prog_str[imgr->progress] : "unknown-status");
+}
+static DEVICE_ATTR_RO(status);
+
 static ssize_t filename_store(struct device *dev, struct device_attribute *attr,
 			      const char *buf, size_t count)
 {
@@ -288,6 +313,7 @@ static DEVICE_ATTR_WO(filename);
 
 static struct attribute *sec_mgr_update_attrs[] = {
 	&dev_attr_filename.attr,
+	&dev_attr_status.attr,
 	NULL,
 };
 
-- 
2.17.1
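Since the status file is meant to be polled, a rough user-space sketch may help. It relies on the usual sysfs convention that a sysfs_notify() wakes poll() waiters with POLLPRI | POLLERR and that the file must be re-read from offset 0 afterwards. The helper names and return conventions are my own, and the caller is assumed to have opened update/status and read it once before the first wait.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Status strings listed in the ABI entry, in enum order. */
static const char * const status_names[] = {
	"idle", "read_file", "preparing", "writing", "programming"
};

/* Map the text read from update/status to an index, -1 if unknown. */
static int status_index(const char *text)
{
	size_t i, len = strcspn(text, "\n");

	for (i = 0; i < sizeof(status_names) / sizeof(status_names[0]); i++) {
		if (strlen(status_names[i]) == len &&
		    !strncmp(text, status_names[i], len))
			return (int)i;
	}
	return -1;
}

/*
 * Wait for the next notification on an already-open status file, then
 * re-read it. Returns the status index, -1 on error or unknown text,
 * -2 on timeout.
 */
static int wait_status_change(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };
	char buf[32];
	ssize_t n;
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;
	if (ret == 0)
		return -2;

	/* sysfs attributes must be re-read from the start. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';

	return status_index(buf);
}
```

A monitoring tool would loop on wait_status_change() until it sees index 0 (idle) again, and only then look at the error file.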
[PATCH v1 08/12] fpga: expose sec-mgr update errors
Extend Intel Security Manager class driver to include an update/error sysfs node that can be read for error information when a secure update fails. Signed-off-by: Russ Weight Reviewed-by: Wu Hao --- .../ABI/testing/sysfs-class-ifpga-sec-mgr | 17 ++ drivers/fpga/ifpga-sec-mgr.c | 60 +-- include/linux/fpga/ifpga-sec-mgr.h| 1 + 3 files changed, 73 insertions(+), 5 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr index 849ccb2802f8..e7b1b02bf7ee 100644 --- a/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr +++ b/Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr @@ -97,3 +97,20 @@ Description: Read-only. Returns a string describing the current programming. Userspace code can poll on this file, as it will be signaled by sysfs_notify() on each state change. + +What: /sys/class/ifpga_sec_mgr/ifpga_secX/update/error +Date: Sep 2020 +KernelVersion: 5.10 +Contact: Russ Weight +Description: Read-only. Returns a string describing the failure + of a secure update. This string will be in the form + of :, where will be one of + the status strings described for the status sysfs + file and will be one of the following: + hw-error, timeout, user-abort, device-busy, + invalid-file-size, read-write-error, flash-wearout, + file-read-error. The error sysfs file is only + meaningful when the secure update engine is in the + idle state. If this file is read while a secure + update is in progress, then the read will fail with + EBUSY. 
diff --git a/drivers/fpga/ifpga-sec-mgr.c b/drivers/fpga/ifpga-sec-mgr.c index 5fe3d85e2963..a7718bd8ee61 100644 --- a/drivers/fpga/ifpga-sec-mgr.c +++ b/drivers/fpga/ifpga-sec-mgr.c @@ -146,10 +146,16 @@ static void update_progress(struct ifpga_sec_mgr *imgr, sysfs_notify(>dev.kobj, "update", "status"); } +static void set_error(struct ifpga_sec_mgr *imgr, enum ifpga_sec_err err_code) +{ + imgr->err_state = imgr->progress; + imgr->err_code = err_code; +} + static void ifpga_sec_dev_error(struct ifpga_sec_mgr *imgr, enum ifpga_sec_err err_code) { - imgr->err_code = err_code; + set_error(imgr, err_code); imgr->iops->cancel(imgr); } @@ -172,7 +178,7 @@ static void ifpga_sec_mgr_update(struct work_struct *work) get_device(>dev); if (request_firmware(, imgr->filename, >dev)) { - imgr->err_code = IFPGA_SEC_ERR_FILE_READ; + set_error(imgr, IFPGA_SEC_ERR_FILE_READ); goto idle_exit; } @@ -180,7 +186,7 @@ static void ifpga_sec_mgr_update(struct work_struct *work) imgr->remaining_size = fw->size; if (!try_module_get(imgr->dev.parent->driver->owner)) { - imgr->err_code = IFPGA_SEC_ERR_BUSY; + set_error(imgr, IFPGA_SEC_ERR_BUSY); goto release_fw_exit; } @@ -266,16 +272,59 @@ static const char * const sec_mgr_prog_str[] = { "programming" /* IFPGA_SEC_PROG_PROGRAMMING */ }; +static const char * const sec_mgr_err_str[] = { + "none", /* IFPGA_SEC_ERR_NONE */ + "hw-error", /* IFPGA_SEC_ERR_HW_ERROR */ + "timeout", /* IFPGA_SEC_ERR_TIMEOUT */ + "user-abort", /* IFPGA_SEC_ERR_CANCELED */ + "device-busy", /* IFPGA_SEC_ERR_BUSY */ + "invalid-file-size",/* IFPGA_SEC_ERR_INVALID_SIZE */ + "read-write-error", /* IFPGA_SEC_ERR_RW_ERROR */ + "flash-wearout",/* IFPGA_SEC_ERR_WEAROUT */ + "file-read-error" /* IFPGA_SEC_ERR_FILE_READ */ +}; + +static const char *sec_progress(enum ifpga_sec_prog prog) +{ + return (prog < IFPGA_SEC_PROG_MAX) ? 
+ sec_mgr_prog_str[prog] : "unknown-status"; +} + static ssize_t status_show(struct device *dev, struct device_attribute *attr, char *buf) { struct ifpga_sec_mgr *imgr = to_sec_mgr(dev); - return sprintf(buf, "%s\n", (imgr->progress < IFPGA_SEC_PROG_MAX) ? - sec_mgr_prog_str[imgr->progress] : "unknown-status"); + return sprintf(buf, "%s\n", sec_progress(imgr->progress)); } static DEVICE_ATTR_RO(status); +static ssize_t +error_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct ifpga_sec_mgr *imgr = to_sec_mgr(dev); + enum ifpga_sec_err err_code; + const char *prog_str; + int ret; + + mutex_lock(>lock); + if (imgr->progress != IFPGA_SEC_PROG_IDLE) { + ret = -EBUSY; + } else if (!imgr->err_code) { + ret = 0; + } else { + err_code = imgr->err_code; +
[PATCH v1 06/12] fpga: add max10 secure update functions
Extend the MAX10 BMC Security Engine driver to include the functions that enable secure updates of BMC images, FPGA images, etc. Signed-off-by: Russ Weight Reviewed-by: Wu Hao --- drivers/fpga/intel-m10-bmc-secure.c | 272 +++- include/linux/mfd/intel-m10-bmc.h | 101 +++ 2 files changed, 372 insertions(+), 1 deletion(-) diff --git a/drivers/fpga/intel-m10-bmc-secure.c b/drivers/fpga/intel-m10-bmc-secure.c index 46cd49a08be0..4a66c2d448eb 100644 --- a/drivers/fpga/intel-m10-bmc-secure.c +++ b/drivers/fpga/intel-m10-bmc-secure.c @@ -5,6 +5,7 @@ * Copyright (C) 2019-2020 Intel Corporation. All rights reserved. * */ +#include #include #include #include @@ -184,6 +185,271 @@ SYSFS_GET_CSK_VEC(bmc, BMC_PROG_ADDR + CSK_VEC_OFFSET) SYSFS_GET_CSK_VEC(sr, SR_PROG_ADDR + CSK_VEC_OFFSET) SYSFS_GET_CSK_VEC(pr, PR_PROG_ADDR + CSK_VEC_OFFSET) +static void log_error_regs(struct m10bmc_sec *sec, u32 doorbell) +{ + u32 auth_result; + + dev_err(sec->dev, "RSU error status: 0x%08x\n", doorbell); + + if (!m10bmc_sys_read(sec->m10bmc, M10BMC_AUTH_RESULT, _result)) + dev_err(sec->dev, "RSU auth result: 0x%08x\n", auth_result); +} + +static enum ifpga_sec_err rsu_check_idle(struct m10bmc_sec *sec) +{ + u32 doorbell; + int ret; + + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, ); + if (ret) + return IFPGA_SEC_ERR_RW_ERROR; + + if (rsu_prog(doorbell) != RSU_PROG_IDLE && + rsu_prog(doorbell) != RSU_PROG_RSU_DONE) { + log_error_regs(sec, doorbell); + return IFPGA_SEC_ERR_BUSY; + } + + return IFPGA_SEC_ERR_NONE; +} + +static inline bool rsu_start_done(u32 doorbell) +{ + return (!(doorbell & RSU_REQUEST) && + (rsu_stat(doorbell) == RSU_STAT_ERASE_FAIL || + rsu_stat(doorbell) == RSU_STAT_WEAROUT || + (rsu_prog(doorbell) != RSU_PROG_IDLE && +rsu_prog(doorbell) != RSU_PROG_RSU_DONE))); +} + +static enum ifpga_sec_err rsu_update_init(struct m10bmc_sec *sec) +{ + u32 doorbell; + int ret; + + ret = m10bmc_sys_update_bits(sec->m10bmc, M10BMC_DOORBELL, +RSU_REQUEST | HOST_STATUS, RSU_REQUEST | 
+FIELD_PREP(HOST_STATUS, HOST_STATUS_IDLE)); + if (ret) + return IFPGA_SEC_ERR_RW_ERROR; + + ret = regmap_read_poll_timeout(sec->m10bmc->regmap, + M10BMC_SYS_BASE + M10BMC_DOORBELL, + doorbell, + rsu_start_done(doorbell), + NIOS_HANDSHAKE_INTERVAL_US, + NIOS_HANDSHAKE_TIMEOUT_US); + + if (ret == -ETIMEDOUT) { + log_error_regs(sec, doorbell); + return IFPGA_SEC_ERR_TIMEOUT; + } else if (ret) { + return IFPGA_SEC_ERR_RW_ERROR; + } + + if (rsu_stat(doorbell) == RSU_STAT_WEAROUT) { + dev_warn(sec->dev, "Excessive flash update count detected\n"); + return IFPGA_SEC_ERR_WEAROUT; + } else if (rsu_stat(doorbell) == RSU_STAT_ERASE_FAIL) { + log_error_regs(sec, doorbell); + return IFPGA_SEC_ERR_HW_ERROR; + } + + return IFPGA_SEC_ERR_NONE; +} + +static enum ifpga_sec_err rsu_prog_ready(struct m10bmc_sec *sec) +{ + unsigned long poll_timeout; + u32 doorbell; + int ret; + + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, ); + poll_timeout = jiffies + msecs_to_jiffies(RSU_PREP_TIMEOUT_MS); + while (!ret && !time_after(jiffies, poll_timeout)) { + if (rsu_prog(doorbell) != RSU_PROG_PREPARE) + break; + msleep(RSU_PREP_INTERVAL_MS); + ret = m10bmc_sys_read(sec->m10bmc, M10BMC_DOORBELL, ); + } + + if (ret) { + return IFPGA_SEC_ERR_RW_ERROR; + } else if (rsu_prog(doorbell) == RSU_PROG_PREPARE) { + log_error_regs(sec, doorbell); + return IFPGA_SEC_ERR_TIMEOUT; + } else if (rsu_prog(doorbell) != RSU_PROG_READY) { + log_error_regs(sec, doorbell); + return IFPGA_SEC_ERR_HW_ERROR; + } + + return IFPGA_SEC_ERR_NONE; +} + +static enum ifpga_sec_err rsu_send_data(struct m10bmc_sec *sec) +{ + u32 doorbell; + int ret; + + ret = m10bmc_sys_update_bits(sec->m10bmc, M10BMC_DOORBELL, HOST_STATUS, +FIELD_PREP(HOST_STATUS, + HOST_STATUS_WRITE_DONE)); + if (ret) + return IFPGA_SEC_ERR_RW_ERROR; + + ret = regmap_read_poll_timeout(sec->m10bmc->regmap, + M10BMC_SYS_BASE + M10BMC_DOORBELL, + doorbell, +
[PATCH v1 00/12] Intel FPGA Security Manager Class Driver
These patches depend on the patchset: "add regmap-spi-avmm & Intel Max10
BMC chip support" which is currently under review.

--

This patchset introduces the Intel Security Manager class driver
for managing secure updates on Intel FPGA Cards. It also provides
the n3000bmc-secure mfd sub-driver for the MAX10 BMC for the n3000
Programmable Acceleration Cards (PAC). The n3000bmc-secure driver
is implemented using the Intel Security Manager class driver.

The Intel Security Manager class driver provides a common API for
user-space tools to manage updates for Secure FPGA devices. Device
drivers that instantiate the Intel Security Manager class driver will
interact with the HW secure update engine in order to transfer
new FPGA and BMC images to FLASH so that they will be automatically
loaded when the FPGA card reboots.

The API consists of sysfs nodes and supports the following functions:

(1) Instantiate and monitor a secure update
(2) Display security information including: Root Entry Hashes (REH),
    Cancelled Code Signing Keys (CSK), and flash update counts for
    both BMC and FPGA images.

Secure updates make use of the request_firmware framework, which
requires that image files are accessible under /lib/firmware. A request
for a secure update returns immediately, while the update itself
proceeds in the context of a kernel worker thread. Sysfs files provide
a means for monitoring the progress of a secure update and for
retrieving error information in the event of a failure.

The n3000bmc-secure driver instantiates the Intel Security Manager
class driver and provides the callback functions required to support
secure updates on Intel n3000 PAC devices.

Russ Weight (12):
  fpga: fpga security manager class driver
  fpga: create intel max10 bmc security engine
  fpga: expose max10 flash update counts in sysfs
  fpga: expose max10 canceled keys in sysfs
  fpga: enable secure updates
  fpga: add max10 secure update functions
  fpga: expose sec-mgr update status
  fpga: expose sec-mgr update errors
  fpga: expose sec-mgr update size
  fpga: enable sec-mgr update cancel
  fpga: expose hardware error info in sysfs
  fpga: add max10 get_hw_errinfo callback func

 .../ABI/testing/sysfs-class-ifpga-sec-mgr | 151
 MAINTAINERS                               |   8 +
 drivers/fpga/Kconfig                      |  20 +
 drivers/fpga/Makefile                     |   6 +
 drivers/fpga/ifpga-sec-mgr.c              | 669 ++
 drivers/fpga/intel-m10-bmc-secure.c       | 557 +++
 include/linux/fpga/ifpga-sec-mgr.h        | 201 ++
 include/linux/mfd/intel-m10-bmc.h         | 116 +++
 8 files changed, 1728 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-ifpga-sec-mgr
 create mode 100644 drivers/fpga/ifpga-sec-mgr.c
 create mode 100644 drivers/fpga/intel-m10-bmc-secure.c
 create mode 100644 include/linux/fpga/ifpga-sec-mgr.h

-- 
2.17.1
[PATCH v1 02/12] fpga: create intel max10 bmc security engine
Create a platform driver that can be invoked as a sub driver for the Intel MAX10 BMC in order to support secure updates. This sub-driver will invoke an instance of the Intel FPGA Security Manager class driver in order to expose sysfs interfaces for managing and monitoring secure updates to FPGA and BMC images. This patch creates the MAX10 BMC Security Engine driver and provides support for displaying the current root entry hashes for the FPGA static region, the FPGA PR region, and the MAX10 BMC. Signed-off-by: Russ Weight Reviewed-by: Wu Hao --- drivers/fpga/Kconfig| 11 ++ drivers/fpga/Makefile | 3 + drivers/fpga/intel-m10-bmc-secure.c | 170 include/linux/mfd/intel-m10-bmc.h | 15 +++ 4 files changed, 199 insertions(+) create mode 100644 drivers/fpga/intel-m10-bmc-secure.c diff --git a/drivers/fpga/Kconfig b/drivers/fpga/Kconfig index 97c0a6cc2ba7..0f0bed68e618 100644 --- a/drivers/fpga/Kconfig +++ b/drivers/fpga/Kconfig @@ -244,4 +244,15 @@ config IFPGA_SEC_MGR region and for the BMC. Select this option to enable updates for secure FPGA devices. +config IFPGA_M10_BMC_SECURE +tristate "Intel MAX10 BMC security engine" + depends on MFD_INTEL_M10_BMC && IFPGA_SEC_MGR +help + Secure update support for the Intel MAX10 board management + controller. + + This is a subdriver of the Intel MAX10 board management controller + (BMC) and provides support for secure updates for the BMC image, + the FPGA image, the Root Entry Hashes, etc. 
+ endif # FPGA diff --git a/drivers/fpga/Makefile b/drivers/fpga/Makefile index ec9fbacdedd8..451a23ec3168 100644 --- a/drivers/fpga/Makefile +++ b/drivers/fpga/Makefile @@ -24,6 +24,9 @@ obj-$(CONFIG_ALTERA_PR_IP_CORE_PLAT)+= altera-pr-ip-core-plat.o # Intel FPGA Security Manager Framework obj-$(CONFIG_IFPGA_SEC_MGR)+= ifpga-sec-mgr.o +# Intel Security Manager Drivers +obj-$(CONFIG_IFPGA_M10_BMC_SECURE) += intel-m10-bmc-secure.o + # FPGA Bridge Drivers obj-$(CONFIG_FPGA_BRIDGE) += fpga-bridge.o obj-$(CONFIG_SOCFPGA_FPGA_BRIDGE) += altera-hps2fpga.o altera-fpga2sdram.o diff --git a/drivers/fpga/intel-m10-bmc-secure.c b/drivers/fpga/intel-m10-bmc-secure.c new file mode 100644 index ..1f86bfb694b4 --- /dev/null +++ b/drivers/fpga/intel-m10-bmc-secure.c @@ -0,0 +1,170 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Intel Max10 Board Management Controller Security Engine Driver + * + * Copyright (C) 2019-2020 Intel Corporation. All rights reserved. + * + */ +#include +#include +#include +#include +#include +#include + +struct m10bmc_sec { + struct device *dev; + struct intel_m10bmc *m10bmc; + struct ifpga_sec_mgr *imgr; +}; + +#define SHA256_REH_SIZE32 +#define SHA384_REH_SIZE48 + +static int get_reh_size(struct ifpga_sec_mgr *imgr, + u32 exp_magic, u32 prog_addr) +{ + struct m10bmc_sec *sec = imgr->priv; + int sha_num_bytes, ret; + u32 magic; + + ret = m10bmc_raw_read(sec->m10bmc, prog_addr, ); + if (ret) + return ret; + + dev_dbg(sec->dev, "%s magic 0x%08x\n", __func__, magic); + + if ((magic & 0x) != exp_magic) + return 0; + + sha_num_bytes = ((magic >> 16) & 0x) / 8; + + if (sha_num_bytes != SHA256_REH_SIZE && + sha_num_bytes != SHA384_REH_SIZE) { + dev_err(sec->dev, "%s bad sha num bytes %d\n", __func__, + sha_num_bytes); + return -EINVAL; + } + + return sha_num_bytes; +} + +#define BMC_REH_ADDR 0x17ffc004 +#define BMC_PROG_ADDR 0x17ffc000 +#define BMC_PROG_MAGIC 0x5746 + +#define SR_REH_ADDR 0x17ffd004 +#define SR_PROG_ADDR 0x17ffd000 +#define SR_PROG_MAGIC 
0x5253 + +#define PR_REH_ADDR 0x17ffe004 +#define PR_PROG_ADDR 0x17ffe000 +#define PR_PROG_MAGIC 0x5250 + +#define SYSFS_GET_REH_SIZE(_name, _exp_magic, _prog_addr) \ +static int get_##_name##_reh_size(struct ifpga_sec_mgr *imgr) \ +{ \ + return get_reh_size(imgr, _exp_magic, _prog_addr); \ +} + +SYSFS_GET_REH_SIZE(bmc, BMC_PROG_MAGIC, BMC_PROG_ADDR) +SYSFS_GET_REH_SIZE(sr, SR_PROG_MAGIC, SR_PROG_ADDR) +SYSFS_GET_REH_SIZE(pr, PR_PROG_MAGIC, PR_PROG_ADDR) + +static int get_root_entry_hash(struct ifpga_sec_mgr *imgr, + u32 hash_addr, u8 *hash, + unsigned int size) +{ + struct m10bmc_sec *sec = imgr->priv; + unsigned int stride = regmap_get_reg_stride(sec->m10bmc->regmap); + int ret; + + ret = m10bmc_raw_bulk_read(sec->m10bmc, hash_addr, + hash, size / stride); + if (ret) + dev_err(sec->dev, "bulk_read of 0x%x failed %d", + hash_addr, ret); + + return ret; +} + +#define SYSFS_GET_REH(_name, _hash_addr) \ +static
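For readers following get_reh_size() above, here is a minimal user-space sketch of the decoding it appears to perform: the low half of the word at the PROG address carries the magic, the high half carries the hash length in bits. This is only an illustration. The two 0xffff masks are an assumption (the digits are missing from the posted diff), and demo_reh_size() and the DEMO_* names are hypothetical, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SHA256_REH_SIZE 32
#define SHA384_REH_SIZE 48

/* ASSUMPTION: both masks are 0xffff; the posted diff lost the digits. */
#define DEMO_MAGIC_MASK 0xffffu
#define DEMO_BITS_MASK  0xffffu

/*
 * Hypothetical helper mirroring the flow of get_reh_size():
 * returns 0 when the magic does not match, the hash size in bytes
 * when it does, and -1 for an unexpected size (the driver returns
 * -EINVAL in that case).
 */
static int demo_reh_size(uint32_t magic, uint32_t exp_magic)
{
	int sha_num_bytes;

	/* The expected magic must fit in the masked field. */
	assert(exp_magic <= DEMO_MAGIC_MASK);

	if ((magic & DEMO_MAGIC_MASK) != exp_magic)
		return 0;

	/* Upper half holds the hash length in bits; convert to bytes. */
	sha_num_bytes = (int)((magic >> 16) & DEMO_BITS_MASK) / 8;

	if (sha_num_bytes != SHA256_REH_SIZE &&
	    sha_num_bytes != SHA384_REH_SIZE)
		return -1;

	return sha_num_bytes;
}
```

Under those assumptions, a word of 0x01005746 checked against BMC_PROG_MAGIC (0x5746) would decode to 256 bits, i.e. a 32-byte hash.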
[PATCH 2/4] arm64: dts: ti: k3-j7200-main: add main navss cpts node
Add DT node for Main NAVSS CPTS module. Signed-off-by: Grygorii Strashko --- arch/arm64/boot/dts/ti/k3-j7200-main.dtsi | 12 1 file changed, 12 insertions(+) diff --git a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi index cc4ff380a7bc..822c062e25c8 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi @@ -117,6 +117,18 @@ <0x0c>; /* RX_UHCHAN */ ti,sci-rm-range-rflow = <0x00>; /* GP RFLOW */ }; + + cpts@310d { + compatible = "ti,j721e-cpts"; + reg = <0x0 0x310d 0x0 0x400>; + reg-names = "cpts"; + clocks = <_clks 201 1>; + clock-names = "cpts"; + interrupts-extended = <_navss_intr 391>; + interrupt-names = "cpts"; + ti,cpts-periodic-outputs = <6>; + ti,cpts-ext-ts-inputs = <8>; + }; }; main_pmx0: pinmux@11c000 { -- 2.17.1
[PATCH 0/4] arm64: dts: ti: k3-j7200: add dma and mcu cpsw
Hi All,

arm64: dts: ti: k3-j7200: add dma and mcu cpsw nodes

This series adds DT nodes for the TI J7200 SoC:
- INTR/INTA, Ringacc and UDMA nodes for Main and MCU NAVSS, which are
  compatible with the J721E SoC, to enable DMA support
- MCU CPSW2g DT nodes to enable networking

This series depends on:
- [PATCH v2 0/4] arm64: Initial support for Texas Instrument's J7200 Platform [1]
  from: Lokesh Vutla
- [PATCH] soc: ti: k3-socinfo: Add entry for J7200 [2]
  from: Peter Ujfalusi

[1] https://lore.kernel.org/linux-arm-kernel/20200827065144.17683-1-lokeshvu...@ti.com/T/#m141ae4d0dd818518c00c81806d689983d6e832e6
[2] https://lore.kernel.org/patchwork/patch/1283230/

Grygorii Strashko (3):
  arm64: dts: ti: k3-j7200-main: add main navss cpts node
  arm64: dts: ti: k3-j7200-mcu: add mcu cpsw nuss node
  arm64: dts: ti: k3-j7200-common-proc-board: add mcu cpsw nuss pinmux and phy defs

Peter Ujfalusi (1):
  arm64: dts: ti: k3-j7200: add DMA support

 .../dts/ti/k3-j7200-common-proc-board.dts | 45 +++
 arch/arm64/boot/dts/ti/k3-j7200-main.dtsi | 73 +++
 .../boot/dts/ti/k3-j7200-mcu-wakeup.dtsi | 118 ++
 3 files changed, 236 insertions(+)

-- 
2.17.1
[PATCH 4/4] arm64: dts: ti: k3-j7200-common-proc-board: add mcu cpsw nuss pinmux and phy defs
The TI j7200 EVM base board has TI DP83867 PHY connected to external CPSW NUSS Port 1 in rgmii-rxid mode. Hence, add pinmux and Ethernet PHY configuration for TI j7200 SoC MCU Gigabit Ethernet two ports Switch subsystem (CPSW NUSS). Signed-off-by: Grygorii Strashko --- .../dts/ti/k3-j7200-common-proc-board.dts | 45 +++ 1 file changed, 45 insertions(+) diff --git a/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts b/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts index e27069317c4e..52bde66930d1 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts +++ b/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts @@ -6,6 +6,7 @@ /dts-v1/; #include "k3-j7200-som-p0.dtsi" +#include / { chosen { @@ -14,6 +15,32 @@ }; }; +_pmx0 { + mcu_cpsw_pins_default: mcu_cpsw_pins_default { + pinctrl-single,pins = < + J721E_WKUP_IOPAD(0x0068, PIN_OUTPUT, 0) /* MCU_RGMII1_TX_CTL */ + J721E_WKUP_IOPAD(0x006c, PIN_INPUT, 0) /* MCU_RGMII1_RX_CTL */ + J721E_WKUP_IOPAD(0x0070, PIN_OUTPUT, 0) /* MCU_RGMII1_TD3 */ + J721E_WKUP_IOPAD(0x0074, PIN_OUTPUT, 0) /* MCU_RGMII1_TD2 */ + J721E_WKUP_IOPAD(0x0078, PIN_OUTPUT, 0) /* MCU_RGMII1_TD1 */ + J721E_WKUP_IOPAD(0x007c, PIN_OUTPUT, 0) /* MCU_RGMII1_TD0 */ + J721E_WKUP_IOPAD(0x0088, PIN_INPUT, 0) /* MCU_RGMII1_RD3 */ + J721E_WKUP_IOPAD(0x008c, PIN_INPUT, 0) /* MCU_RGMII1_RD2 */ + J721E_WKUP_IOPAD(0x0090, PIN_INPUT, 0) /* MCU_RGMII1_RD1 */ + J721E_WKUP_IOPAD(0x0094, PIN_INPUT, 0) /* MCU_RGMII1_RD0 */ + J721E_WKUP_IOPAD(0x0080, PIN_INPUT, 0) /* MCU_RGMII1_TXC */ + J721E_WKUP_IOPAD(0x0084, PIN_INPUT, 0) /* MCU_RGMII1_RXC */ + >; + }; + + mcu_mdio_pins_default: mcu_mdio1_pins_default { + pinctrl-single,pins = < + J721E_WKUP_IOPAD(0x009c, PIN_OUTPUT, 0) /* (L1) MCU_MDIO0_MDC */ + J721E_WKUP_IOPAD(0x0098, PIN_INPUT, 0) /* (L4) MCU_MDIO0_MDIO */ + >; + }; +}; + _uart0 { /* Wakeup UART is used by System firmware */ status = "disabled"; @@ -62,3 +89,21 @@ /* UART not brought out */ status = "disabled"; }; + +_cpsw { + pinctrl-names = 
"default"; + pinctrl-0 = <_cpsw_pins_default _mdio_pins_default>; +}; + +_mdio { + phy0: ethernet-phy@0 { + reg = <0>; + ti,rx-internal-delay = ; + ti,fifo-depth = ; + }; +}; + +_port1 { + phy-mode = "rgmii-rxid"; + phy-handle = <>; +}; -- 2.17.1
[PATCH 1/4] arm64: dts: ti: k3-j7200: add DMA support
From: Peter Ujfalusi Add the intr, inta, ringacc and udmap nodes for main and mcu NAVSS. Signed-off-by: Peter Ujfalusi Signed-off-by: Grygorii Strashko --- arch/arm64/boot/dts/ti/k3-j7200-main.dtsi | 61 +++ .../boot/dts/ti/k3-j7200-mcu-wakeup.dtsi | 44 + 2 files changed, 105 insertions(+) diff --git a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi index 70c8f7e941fb..cc4ff380a7bc 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j7200-main.dtsi @@ -45,6 +45,31 @@ #address-cells = <2>; #size-cells = <2>; ranges = <0x00 0x3000 0x00 0x3000 0x00 0x0c40>; + ti,sci-dev-id = <199>; + + main_navss_intr: interrupt-controller1 { + compatible = "ti,sci-intr"; + ti,intr-trigger-type = <4>; + interrupt-controller; + interrupt-parent = <>; + #interrupt-cells = <1>; + ti,sci = <>; + ti,sci-dev-id = <213>; + ti,interrupt-ranges = <0 64 64>, + <64 448 64>, + <128 672 64>; + }; + + main_udmass_inta: interrupt-controller@33d0 { + compatible = "ti,sci-inta"; + reg = <0x0 0x33d0 0x0 0x10>; + interrupt-controller; + interrupt-parent = <_navss_intr>; + msi-controller; + ti,sci = <>; + ti,sci-dev-id = <209>; + ti,interrupt-ranges = <0 0 256>; + }; secure_proxy_main: mailbox@32c0 { compatible = "ti,am654-secure-proxy"; @@ -56,6 +81,42 @@ interrupt-names = "rx_011"; interrupts = ; }; + + main_ringacc: ringacc@3c00 { + compatible = "ti,am654-navss-ringacc"; + reg = <0x0 0x3c00 0x0 0x40>, + <0x0 0x3800 0x0 0x40>, + <0x0 0x3112 0x0 0x100>, + <0x0 0x3300 0x0 0x4>; + reg-names = "rt", "fifos", "proxy_gcfg", "proxy_target"; + ti,num-rings = <1024>; + ti,sci-rm-range-gp-rings = <0x1>; /* GP ring range */ + ti,sci = <>; + ti,sci-dev-id = <211>; + msi-parent = <_udmass_inta>; + }; + + main_udmap: dma-controller@3115 { + compatible = "ti,j721e-navss-main-udmap"; + reg = <0x0 0x3115 0x0 0x100>, + <0x0 0x3400 0x0 0x10>, + <0x0 0x3500 0x0 0x10>; + reg-names = "gcfg", "rchanrt", "tchanrt"; + msi-parent = <_udmass_inta>; + 
#dma-cells = <1>; + + ti,sci = <>; + ti,sci-dev-id = <212>; + ti,ringacc = <_ringacc>; + + ti,sci-rm-range-tchan = <0x0d>, /* TX_CHAN */ + <0x0f>, /* TX_HCHAN */ + <0x10>; /* TX_UHCHAN */ + ti,sci-rm-range-rchan = <0x0a>, /* RX_CHAN */ + <0x0b>, /* RX_HCHAN */ + <0x0c>; /* RX_UHCHAN */ + ti,sci-rm-range-rflow = <0x00>; /* GP RFLOW */ + }; }; main_pmx0: pinmux@11c000 { diff --git a/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi index 670e4c7cd9fe..9ecb7e0c9cf7 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi @@ -81,4 +81,48 @@ clocks = <_clks 149 2>; clock-names = "fclk"; }; + + cbass_mcu_navss: navss@2838 { + compatible = "simple-mfd"; + #address-cells = <2>; + #size-cells = <2>; + ranges; + dma-coherent; + dma-ranges; + ti,sci-dev-id = <232>; + + mcu_ringacc: ringacc@2b80 { + compatible = "ti,am654-navss-ringacc"; + reg = <0x0 0x2b80 0x0 0x40>, + <0x0 0x2b00 0x0 0x40>, + <0x0 0x2859 0x0 0x100>, + <0x0 0x2a50 0x0 0x4>;
[PATCH 3/4] arm64: dts: ti: k3-j7200-mcu: add mcu cpsw nuss node
Add DT node for The TI j7200 MCU SoC Gigabit Ethernet two ports Switch subsystem (MCU CPSW NUSS). Signed-off-by: Grygorii Strashko --- .../boot/dts/ti/k3-j7200-mcu-wakeup.dtsi | 74 +++ 1 file changed, 74 insertions(+) diff --git a/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi index 9ecb7e0c9cf7..06cd6a80a499 100644 --- a/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi +++ b/arch/arm64/boot/dts/ti/k3-j7200-mcu-wakeup.dtsi @@ -34,6 +34,20 @@ }; }; + mcu_conf: syscon@40f0 { + compatible = "syscon", "simple-mfd"; + reg = <0x0 0x40f0 0x0 0x2>; + #address-cells = <1>; + #size-cells = <1>; + ranges = <0x0 0x0 0x40f0 0x2>; + + phy_gmii_sel: phy@4040 { + compatible = "ti,am654-phy-gmii-sel"; + reg = <0x4040 0x4>; + #phy-cells = <1>; + }; + }; + chipid@4314 { compatible = "ti,am654-chipid"; reg = <0x0 0x4314 0x0 0x4>; @@ -125,4 +139,64 @@ ti,sci-rm-range-rflow = <0x00>; /* GP RFLOW */ }; }; + + mcu_cpsw: ethernet@4600 { + compatible = "ti,j721e-cpsw-nuss"; + #address-cells = <2>; + #size-cells = <2>; + reg = <0x0 0x4600 0x0 0x20>; + reg-names = "cpsw_nuss"; + ranges = <0x0 0x0 0x0 0x4600 0x0 0x20>; + dma-coherent; + clocks = <_clks 18 21>; + clock-names = "fck"; + power-domains = <_pds 18 TI_SCI_PD_EXCLUSIVE>; + + dmas = <_udmap 0xf000>, + <_udmap 0xf001>, + <_udmap 0xf002>, + <_udmap 0xf003>, + <_udmap 0xf004>, + <_udmap 0xf005>, + <_udmap 0xf006>, + <_udmap 0xf007>, + <_udmap 0x7000>; + dma-names = "tx0", "tx1", "tx2", "tx3", + "tx4", "tx5", "tx6", "tx7", + "rx"; + + ethernet-ports { + #address-cells = <1>; + #size-cells = <0>; + + cpsw_port1: port@1 { + reg = <1>; + ti,mac-only; + label = "port1"; + ti,syscon-efuse = <_conf 0x200>; + phys = <_gmii_sel 1>; + }; + }; + + davinci_mdio: mdio@f00 { + compatible = "ti,cpsw-mdio","ti,davinci_mdio"; + reg = <0x0 0xf00 0x0 0x100>; + #address-cells = <1>; + #size-cells = <0>; + clocks = <_clks 18 21>; + clock-names = "fck"; + bus_freq = <100>; + }; + + cpts@3d000 { + compatible = 
"ti,am65-cpts"; + reg = <0x0 0x3d000 0x0 0x400>; + clocks = <_clks 18 2>; + clock-names = "cpts"; + interrupts-extended = < GIC_SPI 858 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "cpts"; + ti,cpts-ext-ts-inputs = <4>; + ti,cpts-periodic-outputs = <2>; + }; + }; }; -- 2.17.1
[tip: x86/cleanups] x86/resctrl: Fix spelling in user-visible warning messages
The following commit has been merged into the x86/cleanups branch of tip:

Commit-ID:     93921baa3f6ff77e57d7e772165aa7bd709b5387
Gitweb:        https://git.kernel.org/tip/93921baa3f6ff77e57d7e772165aa7bd709b5387
Author:        Colin Ian King
AuthorDate:    Mon, 10 Aug 2020 08:55:08 +01:00
Committer:     Borislav Petkov
CommitterDate: Sat, 05 Sep 2020 01:24:17 +02:00

x86/resctrl: Fix spelling in user-visible warning messages

Fix spelling mistake "Could't" -> "Couldn't" in user-visible warning
messages.

 [ bp: Massage commit message; s/cpu/CPU/g ]

Signed-off-by: Colin Ian King
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200810075508.46490-1-colin.k...@canonical.com
---
 arch/x86/kernel/cpu/resctrl/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6a9df71..9cceee6 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -562,7 +562,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 
 	d = rdt_find_domain(r, id, _pos);
 	if (IS_ERR(d)) {
-		pr_warn("Could't find cache id for cpu %d\n", cpu);
+		pr_warn("Couldn't find cache id for CPU %d\n", cpu);
 		return;
 	}
 
@@ -607,7 +607,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 
 	d = rdt_find_domain(r, id, NULL);
 	if (IS_ERR_OR_NULL(d)) {
-		pr_warn("Could't find cache id for cpu %d\n", cpu);
+		pr_warn("Couldn't find cache id for CPU %d\n", cpu);
 		return;
 	}
 
[PATCH v3 1/2] scsi: ibmvfc: use compiler attribute defines instead of __attribute__()
Update ibmvfc.h structs to use the preferred __packed and __aligned() attribute macros defined in include/linux/compiler_attributes.h in place of __attribute__(). Signed-off-by: Tyrel Datwyler --- drivers/scsi/ibmvscsi/ibmvfc.h | 56 +- 1 file changed, 28 insertions(+), 28 deletions(-) diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h index 907889f1fa9d..6da23666f5be 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.h +++ b/drivers/scsi/ibmvscsi/ibmvfc.h @@ -133,16 +133,16 @@ struct ibmvfc_mad_common { __be16 status; __be16 length; __be64 tag; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_npiv_login_mad { struct ibmvfc_mad_common common; struct srp_direct_buf buffer; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_npiv_logout_mad { struct ibmvfc_mad_common common; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); #define IBMVFC_MAX_NAME 256 @@ -168,7 +168,7 @@ struct ibmvfc_npiv_login { u8 device_name[IBMVFC_MAX_NAME]; u8 drc_name[IBMVFC_MAX_NAME]; __be64 reserved2[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_common_svc_parms { __be16 fcph_version; @@ -177,7 +177,7 @@ struct ibmvfc_common_svc_parms { __be16 bb_rcv_sz; /* upper nibble is BB_SC_N */ __be32 ratov; __be32 edtov; -}__attribute__((packed, aligned (4))); +} __packed __aligned(4); struct ibmvfc_service_parms { struct ibmvfc_common_svc_parms common; @@ -192,7 +192,7 @@ struct ibmvfc_service_parms { __be32 ext_len; __be32 reserved[30]; __be32 clk_sync_qos[2]; -}__attribute__((packed, aligned (4))); +} __packed __aligned(4); struct ibmvfc_npiv_login_resp { __be32 version; @@ -217,12 +217,12 @@ struct ibmvfc_npiv_login_resp { u8 drc_name[IBMVFC_MAX_NAME]; struct ibmvfc_service_parms service_parms; __be64 reserved2; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); union ibmvfc_npiv_login_data { struct ibmvfc_npiv_login login; struct ibmvfc_npiv_login_resp 
resp; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_discover_targets_buf { __be32 scsi_id[1]; @@ -239,7 +239,7 @@ struct ibmvfc_discover_targets { __be32 num_avail; __be32 num_written; __be64 reserved[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); enum ibmvfc_fc_reason { IBMVFC_INVALID_ELS_CMD_CODE = 0x01, @@ -283,7 +283,7 @@ struct ibmvfc_port_login { struct ibmvfc_service_parms service_parms; struct ibmvfc_service_parms service_parms_change; __be64 reserved3[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_prli_svc_parms { u8 type; @@ -303,7 +303,7 @@ struct ibmvfc_prli_svc_parms { #define IBMVFC_PRLI_TARGET_FUNC0x0010 #define IBMVFC_PRLI_READ_FCP_XFER_RDY_DISABLED 0x0002 #define IBMVFC_PRLI_WR_FCP_XFER_RDY_DISABLED 0x0001 -}__attribute__((packed, aligned (4))); +} __packed __aligned(4); struct ibmvfc_process_login { struct ibmvfc_mad_common common; @@ -314,7 +314,7 @@ struct ibmvfc_process_login { __be16 error; /* also fc_reason */ __be32 reserved2; __be64 reserved3[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_query_tgt { struct ibmvfc_mad_common common; @@ -325,13 +325,13 @@ struct ibmvfc_query_tgt { __be16 fc_explain; __be16 fc_type; __be64 reserved[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_implicit_logout { struct ibmvfc_mad_common common; __be64 old_scsi_id; __be64 reserved[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_tmf { struct ibmvfc_mad_common common; @@ -348,7 +348,7 @@ struct ibmvfc_tmf { __be32 my_cancel_key; __be32 pad; __be64 reserved[2]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); enum ibmvfc_fcp_rsp_info_codes { RSP_NO_FAILURE = 0x00, @@ -361,7 +361,7 @@ struct ibmvfc_fcp_rsp_info { u8 reserved[3]; u8 rsp_code; u8 reserved2[4]; -}__attribute__((packed, aligned (2))); +} __packed __aligned(2); enum 
ibmvfc_fcp_rsp_flags { FCP_BIDI_RSP= 0x80, @@ -377,7 +377,7 @@ enum ibmvfc_fcp_rsp_flags { union ibmvfc_fcp_rsp_data { struct ibmvfc_fcp_rsp_info info; u8 sense[SCSI_SENSE_BUFFERSIZE + sizeof(struct ibmvfc_fcp_rsp_info)]; -}__attribute__((packed, aligned (8))); +} __packed __aligned(8); struct ibmvfc_fcp_rsp { __be64 reserved; @@ -388,7 +388,7 @@ struct ibmvfc_fcp_rsp { __be32
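As a side note on why this conversion is expected to be layout-neutral: in include/linux/compiler_attributes.h, __packed and __aligned(x) are thin wrappers around the same GCC attributes the old spelling used. The user-space sketch below compares the two spellings on a made-up struct. DEMO_PACKED, DEMO_ALIGNED, struct demo_old and struct demo_new are hypothetical stand-ins, not names from ibmvfc.h, and the sketch assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's __packed and __aligned(x). */
#define DEMO_PACKED     __attribute__((__packed__))
#define DEMO_ALIGNED(x) __attribute__((__aligned__(x)))

/* Old spelling, as removed by the patch. */
struct demo_old {
	uint8_t  type;
	uint32_t flags;
	uint16_t len;
} __attribute__((packed, aligned (8)));

/* New spelling, as added by the patch. */
struct demo_new {
	uint8_t  type;
	uint32_t flags;
	uint16_t len;
} DEMO_PACKED DEMO_ALIGNED(8);

/* Returns 1 when size, alignment and member offsets all agree. */
static int demo_layouts_match(void)
{
	return sizeof(struct demo_old) == sizeof(struct demo_new) &&
	       _Alignof(struct demo_old) == _Alignof(struct demo_new) &&
	       offsetof(struct demo_old, flags) ==
	       offsetof(struct demo_new, flags) &&
	       offsetof(struct demo_old, len) ==
	       offsetof(struct demo_new, len);
}

/* Convenience wrapper for callers that prefer a hard failure. */
static void demo_assert_same_layout(void)
{
	assert(demo_layouts_match());
}
```

In this sketch the members are packed with no internal padding (flags sits at offset 1), while the struct as a whole keeps 8-byte alignment and its size is rounded up to 8.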
[PATCH v3 2/2] scsi: ibmvfc: interface updates for future FPIN and MQ support
VIOS partitions with SLI-4 enabled Emulex adapters will be capable of
driving IO in parallel through multiple work queues or channels, and
with new hypervisor firmware that supports multiple interrupt sources
an ibmvfc NPIV single initiator can be modified to exploit end to end
channelization in a PowerVM environment.

VIOS hosts will also be able to expose fabric performance impact
notifications (FPIN) via a new asynchronous event to ibmvfc clients that
advertise support via IBMVFC_CAN_HANDLE_FPIN in their capabilities flag
during NPIV_LOGIN.

This patch introduces three new Management Datagrams (MADs) for
channelization support negotiation as well as the FPIN asynchronous
event and FPIN status flags. Follow up work is required to plumb the
ibmvfc client driver to use these new interfaces.

Signed-off-by: Tyrel Datwyler
---
v2 -> v3:
	Fixup checkpatch warnings about using __attribute__()
v1 -> v2:
	Fixup compiler errors from neglected commit --amend
---
 drivers/scsi/ibmvscsi/ibmvfc.h | 66 +-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h
index 6da23666f5be..e6e1c255a79c 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.h
+++ b/drivers/scsi/ibmvscsi/ibmvfc.h
@@ -124,6 +124,9 @@ enum ibmvfc_mad_types {
 	IBMVFC_PASSTHRU		= 0x0200,
 	IBMVFC_TMF_MAD		= 0x0100,
 	IBMVFC_NPIV_LOGOUT	= 0x0800,
+	IBMVFC_CHANNEL_ENQUIRY	= 0x1000,
+	IBMVFC_CHANNEL_SETUP	= 0x2000,
+	IBMVFC_CONNECTION_INFO	= 0x4000,
 };
 
 struct ibmvfc_mad_common {
@@ -162,6 +165,8 @@ struct ibmvfc_npiv_login {
 	__be32 max_cmds;
 	__be64 capabilities;
 #define IBMVFC_CAN_MIGRATE		0x01
+#define IBMVFC_CAN_USE_CHANNELS	0x02
+#define IBMVFC_CAN_HANDLE_FPIN		0x04
 	__be64 node_name;
 	struct srp_direct_buf async;
 	u8 partition_name[IBMVFC_MAX_NAME];
@@ -204,6 +209,7 @@ struct ibmvfc_npiv_login_resp {
 	__be64 capabilities;
 #define IBMVFC_CAN_FLUSH_ON_HALT	0x08
 #define IBMVFC_CAN_SUPPRESS_ABTS	0x10
+#define IBMVFC_CAN_SUPPORT_CHANNELS	0x20
 	__be32 max_cmds;
 	__be32 scsi_id_sz;
__be64 max_dma_len; @@ -482,6 +488,52 @@ struct ibmvfc_passthru_mad { struct ibmvfc_passthru_fc_iu fc_iu; } __packed __aligned(8); +struct ibmvfc_channel_enquiry { + struct ibmvfc_mad_common common; + __be32 flags; +#define IBMVFC_NO_CHANNELS_TO_CRQ_SUPPORT 0x01 +#define IBMVFC_SUPPORT_VARIABLE_SUBQ_MSG 0x02 +#define IBMVFC_NO_N_TO_M_CHANNELS_SUPPORT 0x04 + __be32 num_scsi_subq_channels; + __be32 num_nvmeof_subq_channels; + __be32 num_scsi_vas_channels; + __be32 num_nvmeof_vas_channels; +} __packed __aligned(8); + +struct ibmvfc_channel_setup_mad { + struct ibmvfc_mad_common common; + struct srp_direct_buf buffer; +} __packed __aligned(8); + +#define IBMVFC_MAX_CHANNELS502 + +struct ibmvfc_channel_setup { + __be32 flags; +#define IBMVFC_CANCEL_CHANNELS 0x01 +#define IBMVFC_USE_BUFFER 0x02 +#define IBMVFC_CHANNELS_CANCELED 0x04 + __be32 reserved; + __be32 num_scsi_subq_channels; + __be32 num_nvmeof_subq_channels; + __be32 num_scsi_vas_channels; + __be32 num_nvmeof_vas_channels; + struct srp_direct_buf buffer; + __be64 reserved2[5]; + __be64 channel_handles[IBMVFC_MAX_CHANNELS]; +} __packed __aligned(8); + +struct ibmvfc_connection_info { + struct ibmvfc_mad_common common; + __be64 information_bits; +#define IBMVFC_NO_FC_IO_CHANNEL0x01 +#define IBMVFC_NO_PHYP_VAS 0x02 +#define IBMVFC_NO_PHYP_SUBQ0x04 +#define IBMVFC_PHYP_DEPRECATED_SUBQ0x08 +#define IBMVFC_PHYP_PRESERVED_SUBQ 0x10 +#define IBMVFC_PHYP_FULL_SUBQ 0x20 + __be64 reserved[16]; +} __packed __aligned(8); + struct ibmvfc_trace_start_entry { u32 xfer_len; } __packed; @@ -532,6 +584,7 @@ enum ibmvfc_async_event { IBMVFC_AE_HALT = 0x0400, IBMVFC_AE_RESUME= 0x0800, IBMVFC_AE_ADAPTER_FAILED= 0x1000, + IBMVFC_AE_FPIN = 0x2000, }; struct ibmvfc_async_desc { @@ -560,10 +613,18 @@ enum ibmvfc_ae_link_state { IBMVFC_AE_LS_LINK_DEAD = 0x08, }; +enum ibmvfc_ae_fpin_status { + IBMVFC_AE_FPIN_LINK_CONGESTED = 0x1, + IBMVFC_AE_FPIN_PORT_CONGESTED = 0x2, + IBMVFC_AE_FPIN_PORT_CLEARED = 0x3, + IBMVFC_AE_FPIN_PORT_DEGRADED= 
0x4, +}; + struct ibmvfc_async_crq { volatile u8 valid; u8 link_state; - u8 pad[2]; + u8 fpin_status; + u8 pad; __be32 pad2; volatile __be64 event; volatile __be64 scsi_id; @@ -590,6 +651,9 @@ union ibmvfc_iu { struct ibmvfc_tmf tmf; struct ibmvfc_cmd cmd;
Re: printk: Add process name information to printk() output.
On 2020-09-04, Petr Mladek wrote:
>>> I am currently playing with support for all three timestamps based
>>> on https://lore.kernel.org/lkml/20200814101933.574326...@linutronix.de/
>>>
>>> And I got the following idea:
>>>
>>> 1. Storing side:
>>>
>>>    Create one more ring/array for storing the optional metadata.
>>>    It might eventually replace dict ring, see below.
>>>
>>>    struct struct printk_ext_info {
>>>        u64 ts_boot;                   /* timestamp from boot clock */
>>>        u64 ts_real;                   /* timestamp from real clock */
>>>        char process[TASK_COMM_LEN];   /* process name */
>>>    };
>>>
>>>    It must be in a separate array so that struct prb_desc stay stable
>>>    and crashdump tools do not need to be updated so often.
>>>
>>>    But the number of these structures must be the same as descriptors.
>>>    So it might be:
>>>
>>>    struct prb_desc_ring {
>>>        unsigned int count_bits;
>>>        struct prb_desc *descs;
>>>        struct printk_ext_info *ext_info
>>>        atomic_long_t head_id;
>>>        atomic_long_t tail_id;
>>>    };
>>>
>>>    One huge advantage is that these extra information would not block
>>>    pushing lockless printk buffer upstream.
>>>
>>>    It might be even possible to get rid of dict ring and just
>>>    add two more elements into struct printk_ext_info:
>>>
>>>        char subsystem[16];   /* for SUBSYSTEM= dict value */
>>>        char device[48];      /* for DEVICE= dict value */
>
> From my POV, if we support 3 timestamps then they must be stored
> reliably. And dict ring is out of the game.

Agreed. I am just trying to think of how to better manage the strings,
which currently are rare and optional. That is where the dict_ring
becomes interesting.

Perhaps we should use both the fixed structs with the variable
dict_ring. printk_ext_info could look like this:

struct printk_ext_info {
	u64 ts_boot;
	u64 ts_real;
	char *process;
	char *subsystem;
	char *device;
};

And @process, @subsystem, @device could all point to null-terminated
strings within the dict_ring. So printk.c code looks something like
this:

size_t process_sz = strlen(process) + 1;
size_t subsystem_sz = strlen(subsystem) + 1;
size_t device_sz = strlen(device) + 1;
struct prb_reserved_entry e;
struct printk_record r;
char *p;

prb_rec_init_wr(&r, text_len, process_sz + subsystem_sz + device_sz);
prb_reserve(&e, prb, &r);

memcpy(r.text_buf, text, text_len);
r.info->text_len = text_len;

/* guaranteed ext data */
r.ext_info->ts_boot = time_boot();
r.ext_info->ts_real = time_real();

/* optional ext data */
if (r.dict_buf) {
	p = r.dict_buf;
	memcpy(p, process, process_sz);
	r.ext_info->process = p;
	p += process_sz;

	memcpy(p, subsystem, subsystem_sz);
	r.ext_info->subsystem = p;
	p += subsystem_sz;

	memcpy(p, device, device_sz);
	r.ext_info->device = p;

	r.info->dict_len = process_sz + subsystem_sz + device_sz;
}

> And I am not comfortable even with the current dictionary handling.
> I already wrote this somewhere. The following command is supposed
> to show all kernel messages printed by "pci" subsystem:
>
>     $> journalctl _KERNEL_SUBSYSTEM=pci
>
> It will be incomplete when the dictionary metadata were not saved.

In that case, perhaps @subsystem should be a static array in
printk_ext_info instead.

> Regarding the waste of space. The dict ring currently has the same
> size as the text ring. It is likely a waste of space as well. Any
> tuning is complicated because it depends on the use case.

The whole point of the dict_ring is that it allows for variable length
_optional_ data to be stored. If we decide there is no optional data,
then dict_ring is not needed.

> The advantage of the fixed @ext_info[] array is that everything is
> clear, simple, and predictable (taken space and name length limits).
> We could easily tell users what they will get for a given cost.

Agreed. For non-optional data (such as your timestamps), I am in full
agreement that a fixed array is the way to go. And it would only require
a couple lines of code added to the ringbuffer.

My concern is if we need to guarantee space for all possible dictionary
data of all records. I think the dict_ring can be very helpful here.

John Ogness
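For anyone who wants to experiment with the pointer-into-buffer idea outside the kernel, the packing step can be sketched in plain user-space C. This is only an illustration of the scheme discussed above, not ringbuffer code: struct demo_ext_info and demo_pack_dict() are hypothetical names, and an ordinary caller-supplied buffer stands in for the space that would be reserved in the dict_ring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the optional string part of printk_ext_info. */
struct demo_ext_info {
	const char *process;
	const char *subsystem;
	const char *device;
};

/*
 * Copy the three strings back to back into buf and point the info
 * fields at the copies. Returns the number of bytes used, or 0 when
 * buf is missing or too small, in which case all three pointers are
 * left NULL (the "optional data was not saved" case).
 */
static size_t demo_pack_dict(struct demo_ext_info *info,
			     char *buf, size_t buf_sz,
			     const char *process,
			     const char *subsystem,
			     const char *device)
{
	size_t process_sz = strlen(process) + 1;
	size_t subsystem_sz = strlen(subsystem) + 1;
	size_t device_sz = strlen(device) + 1;
	size_t total = process_sz + subsystem_sz + device_sz;
	char *p = buf;

	assert(info != NULL);
	info->process = NULL;
	info->subsystem = NULL;
	info->device = NULL;

	if (!buf || total > buf_sz)
		return 0;

	memcpy(p, process, process_sz);
	info->process = p;
	p += process_sz;

	memcpy(p, subsystem, subsystem_sz);
	info->subsystem = p;
	p += subsystem_sz;

	memcpy(p, device, device_sz);
	info->device = p;

	return total;
}
```

The failure path is the interesting part for this thread: when the buffer is too small, a reader sees NULL pointers rather than truncated strings, which is exactly the situation where a query such as the journalctl one above would come back incomplete.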
[PATCH 1/5] powerpc/tau: Use appropriate temperature sample interval
According to the MPC750 Users Manual, the SITV value in Thermal Management Register 3 is 13 bits long. The present code calculates the SITV value as 60 * 500 cycles. This would overflow to give 10 us on a 500 MHz CPU rather than the intended 60 us. (But according to the Microprocessor Datasheet, there is also a factor of 266 that has to be applied to this value on certain parts i.e. speed sort above 266 MHz.) Always use the maximum cycle count, as recommended by the Datasheet. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain --- arch/powerpc/include/asm/reg.h | 2 +- arch/powerpc/kernel/tau_6xx.c | 12 2 files changed, 5 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 88e6c78100d9b..c750afc62887c 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -815,7 +815,7 @@ #define THRM1_TIN (1 << 31) #define THRM1_TIV (1 << 30) #define THRM1_THRES(x) ((x&0x7f)<<23) -#define THRM3_SITV(x) ((x&0x3fff)<<1) +#define THRM3_SITV(x) ((x & 0x1fff) << 1) #define THRM1_TID (1<<2) #define THRM1_TIE (1<<1) #define THRM1_V(1<<0) diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c index e2ab8a111b693..976d5bc1b5176 100644 --- a/arch/powerpc/kernel/tau_6xx.c +++ b/arch/powerpc/kernel/tau_6xx.c @@ -178,15 +178,11 @@ static void tau_timeout(void * info) * complex sleep code needs to be added. One mtspr every time * tau_timeout is called is probably not a big deal. * -* Enable thermal sensor and set up sample interval timer -* need 20 us to do the compare.. until a nice 'cpu_speed' function -* call is implemented, just assume a 500 mhz clock. It doesn't really -* matter if we take too long for a compare since it's all interrupt -* driven anyway. -* -* use a extra long time.. (60 us @ 500 mhz) +* The "PowerPC 740 and PowerPC 750 Microprocessor Datasheet" +* recommends that "the maximum value be set in THRM3 under all +* conditions." 
*/ - mtspr(SPRN_THRM3, THRM3_SITV(500*60) | THRM3_E); + mtspr(SPRN_THRM3, THRM3_SITV(0x1fff) | THRM3_E); local_irq_restore(flags); } -- 2.26.2
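The arithmetic behind the "10 us" figure in the commit message can be checked with a few lines of user-space C. This is a sketch under stated assumptions, not kernel code: sitv_field() and cycles_to_us() are made-up helper names, the sketch assumes that only the low 13 bits of the requested count take effect, and it ignores the extra factor the Datasheet mentions for faster speed sorts.

```c
#include <assert.h>
#include <stdint.h>

/* Width of the SITV field per the commit message: 13 bits. */
#define SITV_MASK 0x1fffu

/* Hypothetical helper: the part of a cycle count that fits in SITV. */
static uint32_t sitv_field(uint32_t cycles)
{
	return cycles & SITV_MASK;
}

/* Hypothetical helper: whole microseconds for a count at a clock in MHz. */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t mhz)
{
	assert(mhz != 0);
	return cycles / mhz;
}
```

With these helpers, sitv_field(500 * 60) keeps 5424 of the requested 30000 cycles, and cycles_to_us(5424, 500) comes to 10, in line with the commit message. The new code instead requests 0x1fff, i.e. 8191 cycles, the largest count the field can hold.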
[PATCH 0/5] powerpc/tau: TAU driver fixes
This patch series fixes various bugs in the Thermal Assist Unit driver.
It was tested on 266 MHz and 292 MHz PowerBook G3 laptops.

Finn Thain (5):
  powerpc/tau: Use appropriate temperature sample interval
  powerpc/tau: Convert from timer to workqueue
  powerpc/tau: Remove duplicated set_thresholds() call
  powerpc/tau: Check processor type before enabling TAU interrupt
  powerpc/tau: Disable TAU between measurements

 arch/powerpc/include/asm/reg.h | 2 +-
 arch/powerpc/kernel/tau_6xx.c | 147 +
 arch/powerpc/platforms/Kconfig | 14 +---
 3 files changed, 62 insertions(+), 101 deletions(-)

-- 
2.26.2
[PATCH 3/5] powerpc/tau: Remove duplicated set_thresholds() call
The commentary at the call site seems to disagree with the code. The
conditional prevents calling set_thresholds() via the exception handler,
which appears to crash. Perhaps that's because it immediately triggers
another TAU exception. Anyway, calling set_thresholds() from TAUupdate()
is redundant because tau_timeout() does so.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson
Signed-off-by: Finn Thain
---
 arch/powerpc/kernel/tau_6xx.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c
index 268205cc347da..b8d7e7d498e0a 100644
--- a/arch/powerpc/kernel/tau_6xx.c
+++ b/arch/powerpc/kernel/tau_6xx.c
@@ -110,11 +110,6 @@ static void TAUupdate(int cpu)
 #ifdef DEBUG
 	printk("grew = %d\n", tau[cpu].grew);
 #endif
-
-#ifndef CONFIG_TAU_INT /* tau_timeout will do this if not using interrupts */
-	set_thresholds(cpu);
-#endif
-
 }
 
 #ifdef CONFIG_TAU_INT
-- 
2.26.2
[PATCH 2/5] powerpc/tau: Convert from timer to workqueue
Since commit 19dbdcb8039cf ("smp: Warn on function calls from softirq context") the Thermal Assist Unit driver causes a warning like the following when CONFIG_SMP is enabled. [ cut here ] WARNING: CPU: 0 PID: 0 at kernel/smp.c:428 smp_call_function_many_cond+0xf4/0x38c Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac #3 NIP: c00b37a8 LR: c00b3abc CTR: c001218c REGS: c0799c60 TRAP: 0700 Not tainted (5.7.0-pmac) MSR: 00029032 CR: 42000224 XER: GPR00: c00b3abc c0799d18 c076e300 c079ef5c c0011fec GPR08: 0100 0100 8000 42000224 c079d040 c079d044 GPR16: 0001 0004 c0799da0 c079f054 c07a c07a GPR24: c0011fec c079ef5c c079ef5c NIP [c00b37a8] smp_call_function_many_cond+0xf4/0x38c LR [c00b3abc] on_each_cpu+0x38/0x68 Call Trace: [c0799d18] [] 0x (unreliable) [c0799d68] [c00b3abc] on_each_cpu+0x38/0x68 [c0799d88] [c0096704] call_timer_fn.isra.26+0x20/0x7c [c0799d98] [c0096b40] run_timer_softirq+0x1d4/0x3fc [c0799df8] [c05b4368] __do_softirq+0x118/0x240 [c0799e58] [c0039c44] irq_exit+0xc4/0xcc [c0799e68] [c000ade8] timer_interrupt+0x1b0/0x230 [c0799ea8] [c0013520] ret_from_except+0x0/0x14 --- interrupt: 901 at arch_cpu_idle+0x24/0x6c LR = arch_cpu_idle+0x24/0x6c [c0799f70] [0001] 0x1 (unreliable) [c0799f80] [c0060990] do_idle+0xd8/0x17c [c0799fa0] [c0060ba8] cpu_startup_entry+0x24/0x28 [c0799fb0] [c072d220] start_kernel+0x434/0x44c [c0799ff0] [3860] 0x3860 Instruction dump: 8129f204 2f89 40beff98 3d20c07a 8929eec4 2f89 40beff88 0fe0 8122 552805de 550802ef 4182ff84 <0fe0> 3860 7f65db78 7f44d378 ---[ end trace 34a886e47819c2eb ]--- Don't call on_each_cpu() from a timer callback, call it from a worker thread instead. 
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain --- arch/powerpc/kernel/tau_6xx.c | 38 +-- 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c index 976d5bc1b5176..268205cc347da 100644 --- a/arch/powerpc/kernel/tau_6xx.c +++ b/arch/powerpc/kernel/tau_6xx.c @@ -13,13 +13,14 @@ */ #include -#include #include #include #include #include #include #include +#include +#include #include #include @@ -39,8 +40,6 @@ static struct tau_temp unsigned char grew; } tau[NR_CPUS]; -struct timer_list tau_timer; - #undef DEBUG /* TODO: put these in a /proc interface, with some sanity checks, and maybe @@ -50,7 +49,7 @@ struct timer_list tau_timer; #define step_size 2 /* step size when temp goes out of range */ #define window_expand 1 /* expand the window by this much */ /* configurable values for shrinking the window */ -#define shrink_timer 2*HZ/* period between shrinking the window */ +#define shrink_timer 2000/* period between shrinking the window */ #define min_window 2 /* minimum window size, degrees C */ static void set_thresholds(unsigned long cpu) @@ -187,14 +186,18 @@ static void tau_timeout(void * info) local_irq_restore(flags); } -static void tau_timeout_smp(struct timer_list *unused) -{ +static struct workqueue_struct *tau_workq; - /* schedule ourselves to be run again */ - mod_timer(_timer, jiffies + shrink_timer) ; +static void tau_work_func(struct work_struct *work) +{ + msleep(shrink_timer); on_each_cpu(tau_timeout, NULL, 0); + /* schedule ourselves to be run again */ + queue_work(tau_workq, work); } +DECLARE_WORK(tau_work, tau_work_func); + /* * setup the TAU * @@ -227,21 +230,16 @@ static int __init TAU_init(void) return 1; } - - /* first, set up the window shrinking timer */ - timer_setup(_timer, tau_timeout_smp, 0); - tau_timer.expires = jiffies + shrink_timer; - add_timer(_timer); + tau_workq = alloc_workqueue("tau", WQ_UNBOUND, 1, 0); + if 
(!tau_workq) + return -ENOMEM; on_each_cpu(TAU_init_smp, NULL, 0); - printk("Thermal assist unit "); -#ifdef CONFIG_TAU_INT - printk("using interrupts, "); -#else - printk("using timers, "); -#endif - printk("shrink_timer: %d jiffies\n", shrink_timer); + queue_work(tau_workq, _work); + + pr_info("Thermal assist unit using %s, shrink_timer: %d ms\n", + IS_ENABLED(CONFIG_TAU_INT) ? "interrupts" : "workqueue", shrink_timer); tau_initialized = 1; return 0; -- 2.26.2
[PATCH 5/5] powerpc/tau: Disable TAU between measurements
Enabling CONFIG_TAU_INT causes random crashes: Unrecoverable exception 1700 at c0009414 (msr=1000) Oops: Unrecoverable exception, sig: 6 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac-00043-gd5f545e1a8593 #5 NIP: c0009414 LR: c0009414 CTR: c00116fc REGS: c0799eb8 TRAP: 1700 Not tainted (5.7.0-pmac-00043-gd5f545e1a8593) MSR: 1000 CR: 22000228 XER: 0100 GPR00: c0799f70 c076e300 0080 0291c0ac 00e0 c076e300 00049032 GPR08: 0001 c00116fc dfbd3200 007f80a8 GPR16: c075ce04 GPR24: c075ce04 dfff8880 c07b c075ce04 0008 0001 c079ef98 c079ef5c NIP [c0009414] arch_cpu_idle+0x24/0x6c LR [c0009414] arch_cpu_idle+0x24/0x6c Call Trace: [c0799f70] [0001] 0x1 (unreliable) [c0799f80] [c0060990] do_idle+0xd8/0x17c [c0799fa0] [c0060ba4] cpu_startup_entry+0x20/0x28 [c0799fb0] [c072d220] start_kernel+0x434/0x44c [c0799ff0] [3860] 0x3860 Instruction dump: 3d20c07b 7c0802a6 4e800421 7d2000a6 ---[ end trace 3a0c9b5cb216db6b ]--- Resolve this problem by disabling each THRMn comparator when handling the associated THRMn interrupt and by disabling the TAU entirely when updating THRMn thresholds. 
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain --- arch/powerpc/kernel/tau_6xx.c | 65 +- arch/powerpc/platforms/Kconfig | 9 ++--- 2 files changed, 26 insertions(+), 48 deletions(-) diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c index 614b5b272d9c6..0b4694b8d2482 100644 --- a/arch/powerpc/kernel/tau_6xx.c +++ b/arch/powerpc/kernel/tau_6xx.c @@ -42,8 +42,6 @@ static struct tau_temp static bool tau_int_enable; -#undef DEBUG - /* TODO: put these in a /proc interface, with some sanity checks, and maybe * dynamic adjustment to minimize # of interrupts */ /* configurable values for step size and how much to expand the window when @@ -67,42 +65,33 @@ static void set_thresholds(unsigned long cpu) static void TAUupdate(int cpu) { - unsigned thrm; - -#ifdef DEBUG - printk("TAUupdate "); -#endif + u32 thrm; + u32 bits = THRM1_TIV | THRM1_TIN | THRM1_V; /* if both thresholds are crossed, the step_sizes cancel out * and the window winds up getting expanded twice. */ - if((thrm = mfspr(SPRN_THRM1)) & THRM1_TIV){ /* is valid? */ - if(thrm & THRM1_TIN){ /* crossed low threshold */ - if (tau[cpu].low >= step_size){ - tau[cpu].low -= step_size; - tau[cpu].high -= (step_size - window_expand); - } - tau[cpu].grew = 1; -#ifdef DEBUG - printk("low threshold crossed "); -#endif + thrm = mfspr(SPRN_THRM1); + if ((thrm & bits) == bits) { + mtspr(SPRN_THRM1, 0); + + if (tau[cpu].low >= step_size) { + tau[cpu].low -= step_size; + tau[cpu].high -= (step_size - window_expand); } + tau[cpu].grew = 1; + pr_debug("%s: low threshold crossed\n", __func__); } - if((thrm = mfspr(SPRN_THRM2)) & THRM1_TIV){ /* is valid? 
*/ - if(thrm & THRM1_TIN){ /* crossed high threshold */ - if (tau[cpu].high <= 127-step_size){ - tau[cpu].low += (step_size - window_expand); - tau[cpu].high += step_size; - } - tau[cpu].grew = 1; -#ifdef DEBUG - printk("high threshold crossed "); -#endif + thrm = mfspr(SPRN_THRM2); + if ((thrm & bits) == bits) { + mtspr(SPRN_THRM2, 0); + + if (tau[cpu].high <= 127 - step_size) { + tau[cpu].low += (step_size - window_expand); + tau[cpu].high += step_size; } + tau[cpu].grew = 1; + pr_debug("%s: high threshold crossed\n", __func__); } - -#ifdef DEBUG - printk("grew = %d\n", tau[cpu].grew); -#endif } #ifdef CONFIG_TAU_INT @@ -127,17 +116,17 @@ void TAUException(struct pt_regs * regs) static void tau_timeout(void * info) { int cpu; - unsigned long flags; int size; int shrink; - /* disabling interrupts *should* be okay */ - local_irq_save(flags); cpu = smp_processor_id(); if (!tau_int_enable) TAUupdate(cpu); + /* Stop thermal sensor comparisons and interrupts */ + mtspr(SPRN_THRM3, 0); + size = tau[cpu].high - tau[cpu].low; if (size > min_window && !
[PATCH 4/5] powerpc/tau: Check processor type before enabling TAU interrupt
According to Freescale's documentation, MPC74XX processors have an erratum that prevents the TAU interrupt from working, so don't try to use it when running on those processors. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Tested-by: Stan Johnson Signed-off-by: Finn Thain --- arch/powerpc/kernel/tau_6xx.c | 33 ++--- arch/powerpc/platforms/Kconfig | 5 ++--- 2 files changed, 16 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c index b8d7e7d498e0a..614b5b272d9c6 100644 --- a/arch/powerpc/kernel/tau_6xx.c +++ b/arch/powerpc/kernel/tau_6xx.c @@ -40,6 +40,8 @@ static struct tau_temp unsigned char grew; } tau[NR_CPUS]; +static bool tau_int_enable; + #undef DEBUG /* TODO: put these in a /proc interface, with some sanity checks, and maybe @@ -54,22 +56,13 @@ static struct tau_temp static void set_thresholds(unsigned long cpu) { -#ifdef CONFIG_TAU_INT - /* -* setup THRM1, -* threshold, valid bit, enable interrupts, interrupt when below threshold -*/ - mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TIE | THRM1_TID); + u32 maybe_tie = tau_int_enable ? 
THRM1_TIE : 0; - /* setup THRM2, -* threshold, valid bit, enable interrupts, interrupt when above threshold -*/ - mtspr (SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | THRM1_TIE); -#else - /* same thing but don't enable interrupts */ - mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TID); - mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V); -#endif + /* setup THRM1, threshold, valid bit, interrupt when below threshold */ + mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | maybe_tie | THRM1_TID); + + /* setup THRM2, threshold, valid bit, interrupt when above threshold */ + mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | maybe_tie); } static void TAUupdate(int cpu) @@ -142,9 +135,8 @@ static void tau_timeout(void * info) local_irq_save(flags); cpu = smp_processor_id(); -#ifndef CONFIG_TAU_INT - TAUupdate(cpu); -#endif + if (!tau_int_enable) + TAUupdate(cpu); size = tau[cpu].high - tau[cpu].low; if (size > min_window && ! tau[cpu].grew) { @@ -225,6 +217,9 @@ static int __init TAU_init(void) return 1; } + tau_int_enable = IS_ENABLED(CONFIG_TAU_INT) && +!strcmp(cur_cpu_spec->platform, "ppc750"); + tau_workq = alloc_workqueue("tau", WQ_UNBOUND, 1, 0); if (!tau_workq) return -ENOMEM; @@ -234,7 +229,7 @@ static int __init TAU_init(void) queue_work(tau_workq, _work); pr_info("Thermal assist unit using %s, shrink_timer: %d ms\n", - IS_ENABLED(CONFIG_TAU_INT) ? "interrupts" : "workqueue", shrink_timer); + tau_int_enable ? "interrupts" : "workqueue", shrink_timer); tau_initialized = 1; return 0; diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig index fb7515b4fa9c6..9fe36f0b54c1a 100644 --- a/arch/powerpc/platforms/Kconfig +++ b/arch/powerpc/platforms/Kconfig @@ -223,9 +223,8 @@ config TAU temperature within 2-4 degrees Celsius. This option shows the current on-die temperature in /proc/cpuinfo if the cpu supports it. 
- Unfortunately, on some chip revisions, this sensor is very inaccurate - and in many cases, does not work at all, so don't assume the cpu - temp is actually what /proc/cpuinfo says it is. + Unfortunately, this sensor is very inaccurate when uncalibrated, so + don't assume the cpu temp is actually what /proc/cpuinfo says it is. config TAU_INT bool "Interrupt driven TAU driver (DANGEROUS)" -- 2.26.2
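For readers following the TAU series, here is a minimal user-space sketch of the window arithmetic that TAUupdate() applies when a threshold is crossed. It is not the driver code: struct tau_window and tau_window_update() are hypothetical names, and the two bool arguments stand in for the "(thrm & bits) == bits" tests on THRM1 and THRM2. The constants are the step_size, window_expand and 127 limit quoted in tau_6xx.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the #defines quoted in tau_6xx.c. */
#define STEP_SIZE     2   /* step size when temp goes out of range */
#define WINDOW_EXPAND 1   /* expand the window by this much */
#define TEMP_MAX      127 /* bound used in the high-threshold check */

/* Hypothetical stand-in for one element of the driver's tau[] array. */
struct tau_window {
	unsigned char low;
	unsigned char high;
	unsigned char grew;
};

/*
 * Same arithmetic as TAUupdate(); crossed_low and crossed_high model the
 * THRM1 and THRM2 "threshold crossed" conditions (an assumption of this
 * sketch, since no SPR can be read from user space).
 */
void tau_window_update(struct tau_window *w, bool crossed_low,
		       bool crossed_high)
{
	assert(w && w->low <= w->high);

	if (crossed_low) {
		if (w->low >= STEP_SIZE) {
			w->low -= STEP_SIZE;
			w->high -= (STEP_SIZE - WINDOW_EXPAND);
		}
		w->grew = 1;
	}

	if (crossed_high) {
		if (w->high <= TEMP_MAX - STEP_SIZE) {
			w->low += (STEP_SIZE - WINDOW_EXPAND);
			w->high += STEP_SIZE;
		}
		w->grew = 1;
	}
}
```

If both thresholds are crossed in one pass, the two steps cancel out and the window ends up expanded twice, which is the behaviour the driver comment describes.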
Re: [RFC PATCH] certs: Add EFI_CERT_X509_GUID support for dbx entries
> On Sep 4, 2020, at 6:59 AM, Jarkko Sakkinen > wrote: > > On Tue, Sep 01, 2020 at 12:51:43PM -0400, Eric Snowberg wrote: >> The Secure Boot Forbidden Signature Database, dbx, contains a list of now >> revoked signatures and keys previously approved to boot with UEFI Secure >> Boot enabled. The dbx is capable of containing any number of >> EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and EFI_CERT_X509_GUID >> entries. >> >> Currently when EFI_CERT_X509_GUID are contained in the dbx, the entries are >> skipped. >> >> This change adds support for EFI_CERT_X509_GUID dbx entries. When a >> EFI_CERT_X509_GUID is found, it is added as an asymmetrical key to the >> .blacklist keyring. Anytime the .platform keyring is used, the keys in >> the .blacklist keyring are referenced, if a matching key is found, the >> key will be rejected. >> >> Signed-off-by: Eric Snowberg > > In the last paragraph, please use imperative form: "Add support for …". I will change this in V2. > >> --- >> certs/blacklist.c | 36 +++ >> certs/system_keyring.c| 6 >> include/keys/system_keyring.h | 11 ++ >> .../platform_certs/keyring_handler.c | 11 ++ >> 4 files changed, 64 insertions(+) >> >> diff --git a/certs/blacklist.c b/certs/blacklist.c >> index 6514f9ebc943..17ebf50cf0ae 100644 >> --- a/certs/blacklist.c >> +++ b/certs/blacklist.c >> @@ -15,6 +15,7 @@ >> #include >> #include >> #include >> +#include >> #include "blacklist.h" >> >> static struct key *blacklist_keyring; >> @@ -100,6 +101,41 @@ int mark_hash_blacklisted(const char *hash) >> return 0; >> } >> >> +int mark_key_revocationlisted(const char *data, size_t size) >> +{ >> +key_ref_t key; >> + >> +key = key_create_or_update(make_key_ref(blacklist_keyring, true), >> + "asymmetric", >> + NULL, >> + data, >> + size, >> + ((KEY_POS_ALL & ~KEY_POS_SETATTR) | >> +KEY_USR_VIEW), >> + KEY_ALLOC_NOT_IN_QUOTA | >> + KEY_ALLOC_BUILT_IN); >> + >> +if (IS_ERR(key)) { >> +pr_err("Problem with revocation key (%ld)\n", PTR_ERR(key)); >> +return 
PTR_ERR(key); >> +} >> + >> +return 0; >> +} >> + >> +int is_key_revocationlisted(struct pkcs7_message *pkcs7) >> +{ >> +int ret; >> + >> +ret = pkcs7_validate_trust(pkcs7, blacklist_keyring); >> + >> +if (ret == 0) >> +return -EKEYREJECTED; >> + >> +return -ENOKEY; >> +} >> +EXPORT_SYMBOL_GPL(is_key_revocationlisted); >> + >> /** >> * is_hash_blacklisted - Determine if a hash is blacklisted >> * @hash: The hash to be checked as a binary blob >> diff --git a/certs/system_keyring.c b/certs/system_keyring.c >> index 798291177186..f8ea96219155 100644 >> --- a/certs/system_keyring.c >> +++ b/certs/system_keyring.c >> @@ -241,6 +241,12 @@ int verify_pkcs7_message_sig(const void *data, size_t >> len, >> pr_devel("PKCS#7 platform keyring is not available\n"); >> goto error; >> } >> + >> +ret = is_key_revocationlisted(pkcs7); >> +if (ret != -ENOKEY) { >> +pr_devel("PKCS#7 platform key revocationlisted\n"); >> +goto error; >> +} >> } >> ret = pkcs7_validate_trust(pkcs7, trusted_keys); >> if (ret < 0) { >> diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h >> index fb8b07daa9d1..b6991cfe1b6d 100644 >> --- a/include/keys/system_keyring.h >> +++ b/include/keys/system_keyring.h >> @@ -31,11 +31,14 @@ extern int >> restrict_link_by_builtin_and_secondary_trusted( >> #define restrict_link_by_builtin_and_secondary_trusted >> restrict_link_by_builtin_trusted >> #endif >> >> +extern struct pkcs7_message *pkcs7; >> #ifdef CONFIG_SYSTEM_BLACKLIST_KEYRING >> extern int mark_hash_blacklisted(const char *hash); >> +extern int mark_key_revocationlisted(const char *data, size_t size); >> extern int is_hash_blacklisted(const u8 *hash, size_t hash_len, >> const char *type); >> extern int is_binary_blacklisted(const u8 *hash, size_t hash_len); >> +extern int is_key_revocationlisted(struct pkcs7_message *pkcs7); >> #else >> static inline int is_hash_blacklisted(const u8 *hash, size_t hash_len, >>const char *type) >> @@ -47,6 +50,14 @@ static inline int 
is_binary_blacklisted(const u8 *hash, >> size_t hash_len) >> { >> return 0; >> } >> +static inline int mark_key_revocationlisted(const char *data, size_t size) >> +{ >> +return 0; >> +} >> +static inline int is_key_revocationlisted(struct pkcs7_message *pkcs7) >> +{ >> +return -ENOKEY; >> +} >> #endif >> >> #ifdef CONFIG_IMA_BLACKLIST_KEYRING
Re: [PATCH v13 5/5] remoteproc: Add initial zynqmp R5 remoteproc driver
Hello Ben, On Fri, Sep 04, 2020 at 07:32:09AM -0700, Ben Levinsky wrote: > R5 is included in Xilinx Zynq UltraScale MPSoC so by adding this > remotproc driver, we can boot the R5 sub-system in different 2 > configurations: split or lock-step. > > The Xilinx R5 Remoteproc Driver boots the R5's via calls to the Xilinx > Platform Management Unit that handles the R5 configuration, memory access > and R5 lifecycle management. The interface to this manager is done in this > driver via zynqmp_pm_* function calls. > > Signed-off-by: Wendy Liang > Signed-off-by: Michal Simek > Signed-off-by: Ed Mooring > Signed-off-by: Jason Wu > Signed-off-by: Ben Levinsky > --- > v2: > - remove domain struct as per review from Mathieu > v3: > - add xilinx-related platform mgmt fn's instead of wrapping around >function pointer in xilinx eemi ops struct > v4: > - add default values for enums > - fix formatting as per checkpatch.pl --strict. Note that 1 warning and 1 > check >are still raised as each is due to fixing the warning results in that > particular line going over 80 characters. > v5: > - parse_fw change from use of rproc_of_resm_mem_entry_init to > rproc_mem_entry_init and use of alloc/release > - var's of type zynqmp_r5_pdata all have same local variable name > - use dev_dbg instead of dev_info > v6: > - adding memory carveouts is handled much more similarly. All mem > carveouts are >now described in reserved memory as needed. That is, TCM nodes are not >coupled to remoteproc anymore. This is reflected in the remoteproc R5 > driver >and the device tree binding. 
> - remove mailbox from device tree binding as it is not necessary for elf >loading > - use lockstep-mode property for configuring RPU > v7: > - remove unused headers > - change u32 *lockstep_mode -> u32 lockstep_mode; > - change device-tree binding "lockstep-mode" to xlnx,cluster-mode > - remove zynqmp_r5_mem_probe and loop to Probe R5 memory devices at >remoteproc-probe time > - remove is_r5_mode_set from zynqmp rpu remote processor private data > - do not error out if no mailbox is provided > - remove zynqmp_r5_remoteproc_probe call of platform_set_drvdata as > pdata is >handled in zynqmp_r5_remoteproc_remove > v8: > - remove old acks, reviewed-by's in commit message > v9: > - as mboxes are now optional, if pdata->tx_mc_skbs not initialized then > do not call skb_queue_empty > - update usage for zynqmp_pm_set_rpu_mode, zynqmp_pm_set_tcm_config and > zynqmp_pm_get_rpu_mode > - update 5/5 patch commit message to document supported configurations > and how they are booted by the driver. 
> - remove copyrights other than SPDX from zynqmp_r5_remoteproc.c > - compilation warnings no longer raised > - remove unused includes from zynqmp_r5_remoteproc.c > - remove unused var autoboot from zynqmp_r5_remoteproc.c > - reorder zynqmp_r5_pdata fpr small mem savings due to alignment > - use of zynqmp_pm_set_tcm_config now does not have > output arg > - in tcm handling, unconditionally use &= 0x000f mask since all nodes > in this fn are for tcm > - update comments for translating dma field in tcm handling to device > address > - update calls to rproc_mem_entry_init in parse_mem_regions so that there > are only 2 cases for types of carveouts instead of 3 > - in parse_mem_regions, check if device tree node is null before using it > - add example device tree nodes used in parse_mem_regions and tcm parsing > - add comment for vring id node length > - add check for string length so that vring id is at least min length > - move tcm nodes from reserved mem to instead own device tree nodes >and only use them if enabled in device tree > - add comment for explaining handling of rproc_elf_load_rsc_table > - remove obsolete check for "if (vqid < 0)" in zynqmp_r5_rproc_kick > - remove unused field mems in struct zynqmp_r5_pdata > - remove call to zynqmp_r5_mem_probe and the fn itself as tcm handling > is done by zyqmp_r5_pm_request_tcm > - remove obsolete setting of dma_ops and parent device dma_mask > - remove obsolete use of of_dma_configure > - add comment for call to r5_set_mode fn > - make mbox usage optional and gracefully inform user via dev_dbg if not > present > - change var lockstep_mode from u32* to u32 > v11: > - use enums instead of u32 where possible in zynqmp_r5_remoteproc > - update usage of zynqmp_pm_set/get_rpu_mode and zynqmp_pm_set_tcm_config > - update prints to not use carriage return, just newline > - look up tcm banks via property in r5 node instead of string name > - print device tree nodes with %pOF instead of %s with node name field > - update tcm 
release to unmap VA > - handle r5-1 use case > v12: > - update signed off by so that latest developer name is last > - do not cast enums to u32s for zynqmp_pm* functions > --- > drivers/remoteproc/Kconfig| 10 + > drivers/remoteproc/Makefile | 1 + >
[PATCH net-next 9/9] net: ethernet: ti: ale: add support for multi port k3 cpsw versions
The TI J721E (CPSW9g) ALE version is broadly similar to the Sitara AM3/4/5
CPSW ALE, but has more extended functions and a different ALE VLAN entry
format.

This patch adds support for the multi-port TI J721E (CPSW9g) ALE variant.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index 7ca46936a36c..40b6f740d62d 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -191,6 +191,14 @@ const struct ale_entry_fld vlan_entry_nu[ALE_ENT_VID_LAST] = {
 	ALE_ENTRY_FLD(ALE_ENT_VID_REG_MCAST_IDX, 44, 3),
 };
 
+/* K3 j721e/j7200 cpsw9g/5g, am64x cpsw3g */
+const struct ale_entry_fld vlan_entry_k3_cpswxg[] = {
+	ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_MEMBER_LIST, 0),
+	ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_UNREG_MCAST_MSK, 12),
+	ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_FORCE_UNTAGGED_MSK, 24),
+	ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_REG_MCAST_MSK, 36),
+};
+
 DEFINE_ALE_FIELD(entry_type, 60, 2)
 DEFINE_ALE_FIELD(vlan_id, 48, 12)
 DEFINE_ALE_FIELD(mcast_state, 62, 2)
@@ -1213,6 +1221,12 @@ static const struct cpsw_ale_dev_id cpsw_ale_id_match[] = {
 		.nu_switch_ale = true,
 		.vlan_entry_tbl = vlan_entry_nu,
 	},
+	{
+		.dev_id = "j721e-cpswxg",
+		.features = CPSW_ALE_F_STATUS_REG | CPSW_ALE_F_HW_AUTOAGING,
+		.major_ver_mask = 0x7,
+		.vlan_entry_tbl = vlan_entry_k3_cpswxg,
+	},
 	{ },
 };
 
-- 
2.17.1
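To illustrate what a table of dynamically sized mask fields buys, here is a rough user-space sketch of a table-driven field lookup. Everything in it is hypothetical naming (entry_fld, vlan_tbl_k3, entry_get_fld); only the start bits 0/12/24/36 come from the vlan_entry_k3_cpswxg table in the patch. Two assumptions are worth stating loudly: the entry is modelled as a single 64-bit word rather than the driver's real multi-word layout, and the width of a dynamic field is assumed to be the port-mask width supplied at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Field ids: a hypothetical subset mirroring the ALE_ENT_VID_* enum. */
enum {
	FLD_MEMBER_LIST = 0,
	FLD_UNREG_MCAST_MSK,
	FLD_FORCE_UNTAGGED_MSK,
	FLD_REG_MCAST_MSK,
	FLD_LAST,
};

#define FLD_ALLOWED             (1u << 0)
#define FLD_SIZE_PORT_MASK_BITS (1u << 1)

struct entry_fld {
	uint8_t start_bit;
	uint8_t num_bits;
	uint8_t flags;
};

/* Start bits as in vlan_entry_k3_cpswxg; widths are resolved at run time. */
const struct entry_fld vlan_tbl_k3[FLD_LAST] = {
	[FLD_MEMBER_LIST]        = {  0, 0, FLD_ALLOWED | FLD_SIZE_PORT_MASK_BITS },
	[FLD_UNREG_MCAST_MSK]    = { 12, 0, FLD_ALLOWED | FLD_SIZE_PORT_MASK_BITS },
	[FLD_FORCE_UNTAGGED_MSK] = { 24, 0, FLD_ALLOWED | FLD_SIZE_PORT_MASK_BITS },
	[FLD_REG_MCAST_MSK]      = { 36, 0, FLD_ALLOWED | FLD_SIZE_PORT_MASK_BITS },
};

/*
 * Return the field value, or -1 if the table does not allow this field.
 * Assumption: the entry is one 64-bit word, NOT the driver's real layout.
 */
int64_t entry_get_fld(uint64_t entry, const struct entry_fld *tbl,
		      int fld_id, unsigned int port_mask_bits)
{
	const struct entry_fld *fld;
	unsigned int bits;

	assert(tbl && fld_id >= 0 && fld_id < FLD_LAST);

	fld = &tbl[fld_id];
	if (!(fld->flags & FLD_ALLOWED))
		return -1;

	/* Fixed-width fields carry num_bits; dynamic ones use the port count. */
	bits = fld->num_bits;
	if (fld->flags & FLD_SIZE_PORT_MASK_BITS)
		bits = port_mask_bits;
	assert(bits > 0 && bits < 32);

	return (int64_t)((entry >> fld->start_bit) &
			 ((UINT64_C(1) << bits) - 1));
}
```

With a 9-bit port mask, the four fields at 12-bit spacing do not overlap, which is presumably why the K3 table can share one layout across cpsw9g/5g/3g variants.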
[PATCH net-next 7/9] net: ethernet: ti: am65-cpsw: enable hw auto ageing
The AM65x ALE supports HW auto-ageing which can be enabled by programming ageing interval in ALE_AGING_TIMER register. For this CPSW fck_clk frequency has to be know by ALE. This patch extends cpsw_ale_params with bus_freq field and enables ALE HW auto ageing for AM65x CPSW2G ALE version. Signed-off-by: Grygorii Strashko --- drivers/net/ethernet/ti/am65-cpsw-nuss.c | 13 + drivers/net/ethernet/ti/am65-cpsw-nuss.h | 1 + drivers/net/ethernet/ti/cpsw_ale.c | 61 +--- drivers/net/ethernet/ti/cpsw_ale.h | 1 + 4 files changed, 70 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c index bec47e794359..501d676fd88b 100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c @@ -5,6 +5,7 @@ * */ +#include #include #include #include @@ -2038,6 +2039,7 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev) struct am65_cpsw_common *common; struct device_node *node; struct resource *res; + struct clk *clk; int ret, i; common = devm_kzalloc(dev, sizeof(struct am65_cpsw_common), GFP_KERNEL); @@ -2086,6 +2088,16 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev) if (!common->ports) return -ENOMEM; + clk = devm_clk_get(dev, "fck"); + if (IS_ERR(clk)) { + ret = PTR_ERR(clk); + + if (ret != -EPROBE_DEFER) + dev_err(dev, "error getting fck clock %d\n", ret); + return ret; + } + common->bus_freq = clk_get_rate(clk); + pm_runtime_enable(dev); ret = pm_runtime_get_sync(dev); if (ret < 0) { @@ -2134,6 +2146,7 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev) ale_params.ale_ports = common->port_num + 1; ale_params.ale_regs = common->cpsw_base + AM65_CPSW_NU_ALE_BASE; ale_params.dev_id = "am65x-cpsw2g"; + ale_params.bus_freq = common->bus_freq; common->ale = cpsw_ale_create(_params); if (IS_ERR(common->ale)) { diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.h b/drivers/net/ethernet/ti/am65-cpsw-nuss.h index 94f666ea0e53..993e1d4d3222 
100644 --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.h +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.h @@ -106,6 +106,7 @@ struct am65_cpsw_common { u32 nuss_ver; u32 cpsw_ver; + unsigned long bus_freq; boolpf_p0_rx_ptype_rrobin; struct am65_cpts*cpts; int est_enabled; diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 524920a4bff0..7b54e9911b1e 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -32,6 +32,7 @@ #define ALE_STATUS 0x04 #define ALE_CONTROL0x08 #define ALE_PRESCALE 0x10 +#define ALE_AGING_TIMER0x14 #define ALE_UNKNOWNVLAN0x18 #define ALE_TABLE_CONTROL 0x20 #define ALE_TABLE 0x34 @@ -46,6 +47,9 @@ #define AM65_CPSW_ALE_THREAD_DEF_REG 0x134 +/* ALE_AGING_TIMER */ +#define ALE_AGING_TIMER_MASK GENMASK(23, 0) + enum { CPSW_ALE_F_STATUS_REG = BIT(0), /* Status register present */ CPSW_ALE_F_HW_AUTOAGING = BIT(1), /* HW auto aging */ @@ -982,21 +986,66 @@ static void cpsw_ale_timer(struct timer_list *t) } } +static void cpsw_ale_hw_aging_timer_start(struct cpsw_ale *ale) +{ + u32 aging_timer; + + aging_timer = ale->params.bus_freq / 100; + aging_timer *= ale->params.ale_ageout; + + if (aging_timer & ~ALE_AGING_TIMER_MASK) { + aging_timer = ALE_AGING_TIMER_MASK; + dev_warn(ale->params.dev, +"ALE aging timer overflow, set to max\n"); + } + + writel(aging_timer, ale->params.ale_regs + ALE_AGING_TIMER); +} + +static void cpsw_ale_hw_aging_timer_stop(struct cpsw_ale *ale) +{ + writel(0, ale->params.ale_regs + ALE_AGING_TIMER); +} + +static void cpsw_ale_aging_start(struct cpsw_ale *ale) +{ + if (!ale->params.ale_ageout) + return; + + if (ale->features & CPSW_ALE_F_HW_AUTOAGING) { + cpsw_ale_hw_aging_timer_start(ale); + return; + } + + timer_setup(>timer, cpsw_ale_timer, 0); + ale->timer.expires = jiffies + ale->ageout; + add_timer(>timer); +} + +static void cpsw_ale_aging_stop(struct cpsw_ale *ale) +{ + if (!ale->params.ale_ageout) + return; + + if (ale->features & 
CPSW_ALE_F_HW_AUTOAGING) { + cpsw_ale_hw_aging_timer_stop(ale); + return; + } + + del_timer_sync(>timer); +} + void cpsw_ale_start(struct cpsw_ale *ale) { cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
[PATCH net-next 8/9] net: ethernet: ti: ale: switch to use tables for vlan entry description
The ALE VLAN entries are too much differ between different TI CPSW ALE versions. So, handling them using flags, defines and get/set functions became over-complicated. This patch introduces tables to describe the ALE VLAN entries fields, which are different between TI CPSW ALE versions, and new get/set access functions. It also allows to detect incorrect access to not available ALL entry fields. Signed-off-by: Grygorii Strashko --- drivers/net/ethernet/ti/cpsw_ale.c | 239 ++--- drivers/net/ethernet/ti/cpsw_ale.h | 3 + 2 files changed, 188 insertions(+), 54 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c index 7b54e9911b1e..7ca46936a36c 100644 --- a/drivers/net/ethernet/ti/cpsw_ale.c +++ b/drivers/net/ethernet/ti/cpsw_ale.c @@ -50,6 +50,18 @@ /* ALE_AGING_TIMER */ #define ALE_AGING_TIMER_MASK GENMASK(23, 0) +/** + * struct ale_entry_fld - The ALE tbl entry field description + * @start_bit: field start bit + * @u8 num_bits: field bit length + * @flags: field flags + */ +struct ale_entry_fld { + u8 start_bit; + u8 num_bits; + u8 flags; +}; + enum { CPSW_ALE_F_STATUS_REG = BIT(0), /* Status register present */ CPSW_ALE_F_HW_AUTOAGING = BIT(1), /* HW auto aging */ @@ -64,6 +76,7 @@ enum { * @tbl_entries: number of ALE entries * @major_ver_mask: mask of ALE Major Version Value in ALE_IDVER reg. 
* @nu_switch_ale: NU Switch ALE + * @vlan_entry_tbl: ALE vlan entry fields description tbl */ struct cpsw_ale_dev_id { const char *dev_id; @@ -71,6 +84,7 @@ struct cpsw_ale_dev_id { u32 tbl_entries; u32 major_ver_mask; bool nu_switch_ale; + const struct ale_entry_fld *vlan_entry_tbl; }; #define ALE_TABLE_WRITEBIT(31) @@ -132,6 +146,51 @@ static inline void cpsw_ale_set_##name(u32 *ale_entry, u32 value, \ cpsw_ale_set_field(ale_entry, start, bits, value); \ } +enum { + ALE_ENT_VID_MEMBER_LIST = 0, + ALE_ENT_VID_UNREG_MCAST_MSK, + ALE_ENT_VID_REG_MCAST_MSK, + ALE_ENT_VID_FORCE_UNTAGGED_MSK, + ALE_ENT_VID_UNREG_MCAST_IDX, + ALE_ENT_VID_REG_MCAST_IDX, + ALE_ENT_VID_LAST, +}; + +#define ALE_FLD_ALLOWEDBIT(0) +#define ALE_FLD_SIZE_PORT_MASK_BITSBIT(1) +#define ALE_FLD_SIZE_PORT_NUM_BITS BIT(2) + +#define ALE_ENTRY_FLD(id, start, bits) \ +[id] = { \ + .start_bit = start, \ + .num_bits = bits, \ + .flags = ALE_FLD_ALLOWED, \ +} + +#define ALE_ENTRY_FLD_DYN_MSK_SIZE(id, start) \ +[id] = { \ + .start_bit = start, \ + .num_bits = 0, \ + .flags = ALE_FLD_ALLOWED | \ +ALE_FLD_SIZE_PORT_MASK_BITS, \ +} + +/* dm814x, am3/am4/am5, k2hk */ +const struct ale_entry_fld vlan_entry_cpsw[ALE_ENT_VID_LAST] = { + ALE_ENTRY_FLD(ALE_ENT_VID_MEMBER_LIST, 0, 3), + ALE_ENTRY_FLD(ALE_ENT_VID_UNREG_MCAST_MSK, 8, 3), + ALE_ENTRY_FLD(ALE_ENT_VID_REG_MCAST_MSK, 16, 3), + ALE_ENTRY_FLD(ALE_ENT_VID_FORCE_UNTAGGED_MSK, 24, 3), +}; + +/* k2e/k2l, k3 am65/j721e cpsw2g */ +const struct ale_entry_fld vlan_entry_nu[ALE_ENT_VID_LAST] = { + ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_MEMBER_LIST, 0), + ALE_ENTRY_FLD(ALE_ENT_VID_UNREG_MCAST_IDX, 20, 3), + ALE_ENTRY_FLD_DYN_MSK_SIZE(ALE_ENT_VID_FORCE_UNTAGGED_MSK, 24), + ALE_ENTRY_FLD(ALE_ENT_VID_REG_MCAST_IDX, 44, 3), +}; + DEFINE_ALE_FIELD(entry_type, 60, 2) DEFINE_ALE_FIELD(vlan_id, 48, 12) DEFINE_ALE_FIELD(mcast_state, 62, 2) @@ -141,17 +200,76 @@ DEFINE_ALE_FIELD(ucast_type, 62, 2) DEFINE_ALE_FIELD1(port_num,66) DEFINE_ALE_FIELD(blocked, 65, 1) 
DEFINE_ALE_FIELD(secure, 64, 1) -DEFINE_ALE_FIELD1(vlan_untag_force,24) -DEFINE_ALE_FIELD1(vlan_reg_mcast, 16) -DEFINE_ALE_FIELD1(vlan_unreg_mcast,8) -DEFINE_ALE_FIELD1(vlan_member_list,0) DEFINE_ALE_FIELD(mcast,40, 1) -/* ALE NetCP nu switch specific */ -DEFINE_ALE_FIELD(vlan_unreg_mcast_idx, 20, 3) -DEFINE_ALE_FIELD(vlan_reg_mcast_idx, 44, 3) #define NU_VLAN_UNREG_MCAST_IDX1 +static int cpsw_ale_entry_get_fld(struct cpsw_ale *ale, + u32 *ale_entry, + const struct ale_entry_fld *entry_tbl, + int fld_id) +{ + const struct ale_entry_fld *entry_fld; + u32 bits; + + if (!ale || !ale_entry) + return -EINVAL; + + entry_fld = _tbl[fld_id]; + if (!(entry_fld->flags & ALE_FLD_ALLOWED)) { + dev_err(ale->params.dev, "get: wrong ale fld id %d\n", fld_id); + return -ENOENT; + } + + bits =
[PATCH net-next 6/9] net: ethernet: ti: ale: make usage of ale dev_id mandatory
Now that all existing drivers have been updated to use the ALE dev_id, make
the dev_id mandatory and update cpsw_ale_create() to take the "features"
property from the ALE static configuration.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 28 +---
 drivers/net/ethernet/ti/cpsw_ale.h | 1 +
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index 766197003971..524920a4bff0 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -83,7 +83,6 @@ struct cpsw_ale_dev_id {
 
 #define ALE_TABLE_SIZE_MULTIPLIER 1024
 #define ALE_STATUS_SIZE_MASK 0x1f
-#define ALE_TABLE_SIZE_DEFAULT 64
 
 static inline int cpsw_ale_get_field(u32 *ale_entry, u32 start, u32 bits)
 {
@@ -1060,11 +1059,12 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 	u32 rev, ale_entries;
 
 	ale_dev_id = cpsw_ale_match_id(cpsw_ale_id_match, params->dev_id);
-	if (ale_dev_id) {
-		params->ale_entries = ale_dev_id->tbl_entries;
-		params->major_ver_mask = ale_dev_id->major_ver_mask;
-		params->nu_switch_ale = ale_dev_id->nu_switch_ale;
-	}
+	if (!ale_dev_id)
+		return ERR_PTR(-EINVAL);
+
+	params->ale_entries = ale_dev_id->tbl_entries;
+	params->major_ver_mask = ale_dev_id->major_ver_mask;
+	params->nu_switch_ale = ale_dev_id->nu_switch_ale;
 
 	ale = devm_kzalloc(params->dev, sizeof(*ale), GFP_KERNEL);
 	if (!ale)
@@ -1079,6 +1079,7 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 
 	ale->params = *params;
 	ale->ageout = ale->params.ale_ageout * HZ;
+	ale->features = ale_dev_id->features;
 
 	rev = readl_relaxed(ale->params.ale_regs + ALE_IDVER);
 	ale->version =
@@ -1088,7 +1089,8 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 		 ALE_VERSION_MAJOR(rev, ale->params.major_ver_mask),
 		 ALE_VERSION_MINOR(rev));
 
-	if (!ale->params.ale_entries) {
+	if (ale->features & CPSW_ALE_F_STATUS_REG &&
+	    !ale->params.ale_entries) {
 		ale_entries =
 			readl_relaxed(ale->params.ale_regs + ALE_STATUS) &
 			ALE_STATUS_SIZE_MASK;
@@ -1097,16 +1099,12 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 		 * table which shows the size as a multiple of 1024 entries.
 		 * For these, params.ale_entries will be set to zero. So
 		 * read the register and update the value of ale_entries.
-		 * ALE table on NetCP lite, is much smaller and is indicated
-		 * by a value of zero in ALE_STATUS. So use a default value
-		 * of ALE_TABLE_SIZE_DEFAULT for this. Caller is expected
-		 * to set the value of ale_entries for all other versions
-		 * of ALE.
+		 * return error if ale_entries is zero in ALE_STATUS.
 		 */
 		if (!ale_entries)
-			ale_entries = ALE_TABLE_SIZE_DEFAULT;
-		else
-			ale_entries *= ALE_TABLE_SIZE_MULTIPLIER;
+			return ERR_PTR(-EINVAL);
+
+		ale_entries *= ALE_TABLE_SIZE_MULTIPLIER;
 		ale->params.ale_entries = ale_entries;
 	}
 	dev_info(ale->params.dev,
diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 53ad4246617e..27b30802b384 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.h
+++ b/drivers/net/ethernet/ti/cpsw_ale.h
@@ -32,6 +32,7 @@ struct cpsw_ale {
 	struct timer_list timer;
 	unsigned long ageout;
 	u32 version;
+	u32 features;
 	/* These bits are different on NetCP NU Switch ALE */
 	u32 port_mask_bits;
 	u32 port_num_bits;
-- 
2.17.1
[PATCH net-next 4/9] net: netcp: ethss: use dev_id for ale configuration
The previous patch introduced the possibility to select the CPSW ALE by its
dev_id identifier. Switch the TI Keystone 2 NETCP driver to use dev_id and
clean up by removing the "ale_entries" configuration code.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/netcp_ethss.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index 28093923a7fb..33c1592d5381 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -51,7 +51,6 @@
 #define GBE13_CPTS_OFFSET 0x500
 #define GBE13_ALE_OFFSET 0x600
 #define GBE13_HOST_PORT_NUM 0
-#define GBE13_NUM_ALE_ENTRIES 1024
 
 /* 1G Ethernet NU SS defines */
 #define GBENU_MODULE_NAME "netcp-gbenu"
@@ -101,7 +100,6 @@
 #define XGBE10_ALE_OFFSET 0x700
 #define XGBE10_HW_STATS_OFFSET 0x800
 #define XGBE10_HOST_PORT_NUM 0
-#define XGBE10_NUM_ALE_ENTRIES 2048
 
 #define GBE_TIMER_INTERVAL (HZ / 2)
 
@@ -711,7 +709,6 @@ struct gbe_priv {
 	struct netcp_device *netcp_device;
 	struct timer_list timer;
 	u32 num_slaves;
-	u32 ale_entries;
 	u32 ale_ports;
 	bool enable_ale;
 	u8 max_num_slaves;
@@ -3309,7 +3306,6 @@ static int set_xgbe_ethss10_priv(struct gbe_priv *gbe_dev,
 	gbe_dev->cpts_reg = gbe_dev->switch_regs + XGBE10_CPTS_OFFSET;
 	gbe_dev->ale_ports = gbe_dev->max_num_ports;
 	gbe_dev->host_port = XGBE10_HOST_PORT_NUM;
-	gbe_dev->ale_entries = XGBE10_NUM_ALE_ENTRIES;
 	gbe_dev->stats_en_mask = (1 << (gbe_dev->max_num_ports)) - 1;
 
 	/* Subsystem registers */
@@ -3433,7 +3429,6 @@ static int set_gbe_ethss14_priv(struct gbe_priv *gbe_dev,
 	gbe_dev->ale_reg = gbe_dev->switch_regs + GBE13_ALE_OFFSET;
 	gbe_dev->ale_ports = gbe_dev->max_num_ports;
 	gbe_dev->host_port = GBE13_HOST_PORT_NUM;
-	gbe_dev->ale_entries = GBE13_NUM_ALE_ENTRIES;
 	gbe_dev->stats_en_mask = GBE13_REG_VAL_STAT_ENABLE_ALL;
 
 	/* Subsystem registers */
@@ -3697,12 +3692,15 @@ static int gbe_probe(struct netcp_device *netcp_device, struct device *dev,
 	ale_params.dev = gbe_dev->dev;
 	ale_params.ale_regs = gbe_dev->ale_reg;
 	ale_params.ale_ageout = GBE_DEFAULT_ALE_AGEOUT;
-	ale_params.ale_entries = gbe_dev->ale_entries;
 	ale_params.ale_ports = gbe_dev->ale_ports;
-	if (IS_SS_ID_MU(gbe_dev)) {
-		ale_params.major_ver_mask = 0x7;
-		ale_params.nu_switch_ale = true;
-	}
+	ale_params.dev_id = "cpsw";
+	if (IS_SS_ID_NU(gbe_dev))
+		ale_params.dev_id = "66ak2el";
+	else if (IS_SS_ID_2U(gbe_dev))
+		ale_params.dev_id = "66ak2g";
+	else if (IS_SS_ID_XGBE(gbe_dev))
+		ale_params.dev_id = "66ak2h-xgbe";
+
 	gbe_dev->ale = cpsw_ale_create(&ale_params);
 	if (IS_ERR(gbe_dev->ale)) {
 		dev_err(gbe_dev->dev, "error initializing ale engine\n");
-- 
2.17.1
[PATCH net-next 5/9] net: ethernet: ti: am65-cpsw: use dev_id for ale configuration
The previous patch introduced the ability to select the CPSW ALE configuration through the ALE dev_id identifier. Switch the TI AM65x/J721E CPSW NUSS driver to use dev_id.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/am65-cpsw-nuss.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index 9baf3f3da91e..bec47e794359 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -2131,10 +2131,9 @@ static int am65_cpsw_nuss_probe(struct platform_device *pdev)
 	/* init common data */
 	ale_params.dev = dev;
 	ale_params.ale_ageout = AM65_CPSW_ALE_AGEOUT_DEFAULT;
-	ale_params.ale_entries = 0;
 	ale_params.ale_ports = common->port_num + 1;
 	ale_params.ale_regs = common->cpsw_base + AM65_CPSW_NU_ALE_BASE;
-	ale_params.nu_switch_ale = true;
+	ale_params.dev_id = "am65x-cpsw2g";
 
 	common->ale = cpsw_ale_create(&ale_params);
 	if (IS_ERR(common->ale)) {
-- 
2.17.1
[PATCH net-next 1/9] net: ethernet: ti: ale: add cpsw_ale_get_num_entries api
Add a cpsw_ale_get_num_entries() API that returns the number of ALE table entries, and update the existing drivers to use it.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/am65-cpsw-ethtool.c | 10 ++++++----
 drivers/net/ethernet/ti/cpsw_ale.c          |  5 +++++
 drivers/net/ethernet/ti/cpsw_ale.h          |  1 +
 drivers/net/ethernet/ti/cpsw_ethtool.c      |  3 ++-
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ti/am65-cpsw-ethtool.c b/drivers/net/ethernet/ti/am65-cpsw-ethtool.c
index 496dafb25128..6e4d4f9e32e0 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-ethtool.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-ethtool.c
@@ -572,13 +572,14 @@ static int am65_cpsw_nway_reset(struct net_device *ndev)
 static int am65_cpsw_get_regs_len(struct net_device *ndev)
 {
 	struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
-	u32 i, regdump_len = 0;
+	u32 ale_entries, i, regdump_len = 0;
 
+	ale_entries = cpsw_ale_get_num_entries(common->ale);
 	for (i = 0; i < ARRAY_SIZE(am65_cpsw_regdump); i++) {
 		if (am65_cpsw_regdump[i].hdr.module_id ==
 		    AM65_CPSW_REGDUMP_MOD_CPSW_ALE_TBL) {
 			regdump_len += sizeof(struct am65_cpsw_regdump_hdr);
-			regdump_len += common->ale->params.ale_entries *
+			regdump_len += ale_entries *
 				       ALE_ENTRY_WORDS * sizeof(u32);
 			continue;
 		}
@@ -592,10 +593,11 @@ static void am65_cpsw_get_regs(struct net_device *ndev,
 			       struct ethtool_regs *regs, void *p)
 {
 	struct am65_cpsw_common *common = am65_ndev_to_common(ndev);
-	u32 i, j, pos, *reg = p;
+	u32 ale_entries, i, j, pos, *reg = p;
 
 	/* update CPSW IP version */
 	regs->version = AM65_CPSW_REGDUMP_VER;
+	ale_entries = cpsw_ale_get_num_entries(common->ale);
 
 	pos = 0;
 	for (i = 0; i < ARRAY_SIZE(am65_cpsw_regdump); i++) {
@@ -603,7 +605,7 @@ static void am65_cpsw_get_regs(struct net_device *ndev,
 
 		if (am65_cpsw_regdump[i].hdr.module_id ==
 		    AM65_CPSW_REGDUMP_MOD_CPSW_ALE_TBL) {
-			u32 ale_tbl_len = common->ale->params.ale_entries *
+			u32 ale_tbl_len = ale_entries *
 					  ALE_ENTRY_WORDS * sizeof(u32) +
 					  sizeof(struct am65_cpsw_regdump_hdr);
 			reg[pos++] = ale_tbl_len;
diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index 9ad872bfae3a..a94aef3f54a5 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -1079,3 +1079,8 @@ void cpsw_ale_dump(struct cpsw_ale *ale, u32 *data)
 		data += ALE_ENTRY_WORDS;
 	}
 }
+
+u32 cpsw_ale_get_num_entries(struct cpsw_ale *ale)
+{
+	return ale ? ale->params.ale_entries : 0;
+}
diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 6a3cb6898728..735692f066bf 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.h
+++ b/drivers/net/ethernet/ti/cpsw_ale.h
@@ -119,6 +119,7 @@ int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control);
 int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control,
 			 int value);
 void cpsw_ale_dump(struct cpsw_ale *ale, u32 *data);
+u32 cpsw_ale_get_num_entries(struct cpsw_ale *ale);
 
 static inline int cpsw_ale_get_vlan_p0_untag(struct cpsw_ale *ale, u16 vid)
 {
diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c
index fa54efe3be63..4d02c5135611 100644
--- a/drivers/net/ethernet/ti/cpsw_ethtool.c
+++ b/drivers/net/ethernet/ti/cpsw_ethtool.c
@@ -339,7 +339,8 @@ int cpsw_get_regs_len(struct net_device *ndev)
 {
 	struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
 
-	return cpsw->data.ale_entries * ALE_ENTRY_WORDS * sizeof(u32);
+	return cpsw_ale_get_num_entries(cpsw->ale) *
+	       ALE_ENTRY_WORDS * sizeof(u32);
 }
 
 void cpsw_get_regs(struct net_device *ndev, struct ethtool_regs *regs, void *p)
-- 
2.17.1
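To make the size arithmetic in cpsw_get_regs_len() concrete, here is a small user-space sketch of the same computation. It assumes ALE_ENTRY_WORDS is 3, which is an assumption made for this example only (the patch does not show the value), and ale_regs_len() is a hypothetical helper name, not a driver function.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch; the real value comes from the driver headers. */
#define ALE_ENTRY_WORDS 3

/* Hypothetical helper: bytes needed to dump num_entries ALE table entries. */
static size_t ale_regs_len(uint32_t num_entries)
{
	return (size_t)num_entries * ALE_ENTRY_WORDS * sizeof(uint32_t);
}
```

Under that assumption, 1024 entries need 1024 * 3 * 4 = 12288 bytes. When cpsw_ale_get_num_entries() is handed a NULL ale pointer it returns 0, so the computed length is 0 as well.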
[PATCH net-next 0/9] net: ethernet: ti: ale: add static configuration
Hi All,

Existing and newly introduced CPSW ALE versions differ in their supported features and ALE table formats. This is especially relevant for the recent AM65x/J721E/J7200 and future AM64x SoCs, which support more features, such as auto-aging, classifiers, link aggregation and additional HW filtering.

The existing ALE configuration interface is not practical for adding new features and requires consumers to program a lot of static parameters. Any attempt to add new features will cause endless adding and maintaining of different combinations of flags and options. Because the CPSW ALE configuration is static and fixed for an SoC (or set of SoCs), it is reasonable to add support for static ALE configurations inside the ALE module.

This series introduces a static ALE configuration table for the different ALE variants and lets consumers select the required ALE configuration by providing an ALE const char *dev_id identifier (Patch 2). All existing drivers have been switched to the new approach (Patches 3-6).

After this, the ALE HW auto-ageing feature can be enabled for the AM65x CPSW ALE variant (Patch 7).

Finally, Patches 8-9 introduce tables that describe the ALE VLAN entry fields, because the VLAN entries differ too much between TI CPSW ALE versions and handling them with flags, defines and get/set functions has become over-complicated.

Patch 1 is a preparation patch.

Grygorii Strashko (9):
  net: ethernet: ti: ale: add cpsw_ale_get_num_entries api
  net: ethernet: ti: ale: add static configuration
  net: ethernet: ti: cpsw: use dev_id for ale configuration
  net: netcp: ethss: use dev_id for ale configuration
  net: ethernet: ti: am65-cpsw: use dev_id for ale configuration
  net: ethernet: ti: ale: make usage of ale dev_id mandatory
  net: ethernet: ti: am65-cpsw: enable hw auto ageing
  net: ethernet: ti: ale: switch to use tables for vlan entry description
  net: ethernet: ti: ale: add support for multi port k3 cpsw versions

 drivers/net/ethernet/ti/am65-cpsw-ethtool.c |  10 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c    |  16 +-
 drivers/net/ethernet/ti/am65-cpsw-nuss.h    |   1 +
 drivers/net/ethernet/ti/cpsw.c              |   6 -
 drivers/net/ethernet/ti/cpsw_ale.c          | 421
 drivers/net/ethernet/ti/cpsw_ale.h          |   7 +
 drivers/net/ethernet/ti/cpsw_ethtool.c      |   3 +-
 drivers/net/ethernet/ti/cpsw_new.c          |   1 -
 drivers/net/ethernet/ti/cpsw_priv.c         |   2 +-
 drivers/net/ethernet/ti/cpsw_priv.h         |   2 -
 drivers/net/ethernet/ti/netcp_ethss.c       |  18 +-
 11 files changed, 388 insertions(+), 99 deletions(-)

-- 
2.17.1
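As a rough illustration of the dev_id-based selection described in the cover letter, the following user-space C sketch mimics a sentinel-terminated match table with a strcmp() lookup. The names ale_cfg, ale_cfg_table and ale_cfg_lookup() are hypothetical and exist only in this example. The entry values are copied from the table in Patch 2, but this is a simplified sketch and not the driver code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for a per-variant ALE configuration. */
struct ale_cfg {
	const char *dev_id;      /* NULL marks the end of the table */
	uint32_t tbl_entries;    /* number of ALE table entries */
	uint32_t major_ver_mask; /* mask for the major version field */
};

/* Values mirror a few entries from the table in Patch 2. */
static const struct ale_cfg ale_cfg_table[] = {
	{ "cpsw",         1024, 0xff },
	{ "66ak2h-xgbe",  2048, 0xff },
	{ "am65x-cpsw2g",   64, 0x7  },
	{ NULL, 0, 0 }, /* sentinel */
};

/* Return the entry whose dev_id matches, or NULL if none does. */
static const struct ale_cfg *ale_cfg_lookup(const char *dev_id)
{
	const struct ale_cfg *cfg;

	if (!dev_id)
		return NULL;

	for (cfg = ale_cfg_table; cfg->dev_id; cfg++) {
		if (strcmp(dev_id, cfg->dev_id) == 0)
			return cfg;
	}
	return NULL;
}
```

A caller would pass its identifier, for example ale_cfg_lookup("am65x-cpsw2g"), and either fall back to its own defaults or fail when the result is NULL. Adding a new variant then means adding one table entry rather than another flag in every consumer.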
[PATCH net-next 2/9] net: ethernet: ti: ale: add static configuration
Existing and newly introduced CPSW ALE versions differ in their supported features and ALE table formats. This is especially relevant for the recent AM65x/J721E/J7200 and future AM64x SoCs, which support features such as auto-aging, classifiers, link aggregation and additional HW filtering.

The existing ALE configuration interface is not practical for adding new features and requires consumers to program a lot of static parameters. Any attempt to add new options will cause endless adding and maintaining of different combinations of flags and options. Because the CPSW ALE configuration is static and fixed for an SoC (or set of SoCs), it is reasonable to add support for static ALE configurations inside the ALE module.

This patch adds a static ALE configuration table for the different ALE versions and lets consumers select the required ALE configuration by providing an ALE const char *dev_id identifier.

This feature is not enabled by default until the existing CPSW drivers are modified by follow-up patches.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 84 +-
 drivers/net/ethernet/ti/cpsw_ale.h |  1 +
 2 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index a94aef3f54a5..766197003971 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -46,6 +46,29 @@
 
 #define AM65_CPSW_ALE_THREAD_DEF_REG 0x134
 
+enum {
+	CPSW_ALE_F_STATUS_REG = BIT(0), /* Status register present */
+	CPSW_ALE_F_HW_AUTOAGING = BIT(1), /* HW auto aging */
+
+	CPSW_ALE_F_COUNT
+};
+
+/**
+ * struct ale_dev_id - The ALE version/SoC specific configuration
+ * @dev_id: ALE version/SoC id
+ * @features: features supported by ALE
+ * @tbl_entries: number of ALE entries
+ * @major_ver_mask: mask of ALE Major Version Value in ALE_IDVER reg.
+ * @nu_switch_ale: NU Switch ALE
+ */
+struct cpsw_ale_dev_id {
+	const char *dev_id;
+	u32 features;
+	u32 tbl_entries;
+	u32 major_ver_mask;
+	bool nu_switch_ale;
+};
+
 #define ALE_TABLE_WRITE		BIT(31)
 
 #define ALE_TYPE_FREE			0
@@ -979,11 +1002,70 @@ void cpsw_ale_stop(struct cpsw_ale *ale)
 	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
 }
 
+static const struct cpsw_ale_dev_id cpsw_ale_id_match[] = {
+	{
+		/* am3/4/5, dra7. dm814x, 66ak2hk-gbe */
+		.dev_id = "cpsw",
+		.tbl_entries = 1024,
+		.major_ver_mask = 0xff,
+	},
+	{
+		/* 66ak2h_xgbe */
+		.dev_id = "66ak2h-xgbe",
+		.tbl_entries = 2048,
+		.major_ver_mask = 0xff,
+	},
+	{
+		.dev_id = "66ak2el",
+		.features = CPSW_ALE_F_STATUS_REG,
+		.major_ver_mask = 0x7,
+		.nu_switch_ale = true,
+	},
+	{
+		.dev_id = "66ak2g",
+		.features = CPSW_ALE_F_STATUS_REG,
+		.tbl_entries = 64,
+		.major_ver_mask = 0x7,
+		.nu_switch_ale = true,
+	},
+	{
+		.dev_id = "am65x-cpsw2g",
+		.features = CPSW_ALE_F_STATUS_REG | CPSW_ALE_F_HW_AUTOAGING,
+		.tbl_entries = 64,
+		.major_ver_mask = 0x7,
+		.nu_switch_ale = true,
+	},
+	{ },
+};
+
+static const struct
+cpsw_ale_dev_id *cpsw_ale_match_id(const struct cpsw_ale_dev_id *id,
+				   const char *dev_id)
+{
+	if (!dev_id)
+		return NULL;
+
+	while (id->dev_id) {
+		if (strcmp(dev_id, id->dev_id) == 0)
+			return id;
+		id++;
+	}
+	return NULL;
+}
+
 struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 {
+	const struct cpsw_ale_dev_id *ale_dev_id;
 	struct cpsw_ale *ale;
 	u32 rev, ale_entries;
 
+	ale_dev_id = cpsw_ale_match_id(cpsw_ale_id_match, params->dev_id);
+	if (ale_dev_id) {
+		params->ale_entries = ale_dev_id->tbl_entries;
+		params->major_ver_mask = ale_dev_id->major_ver_mask;
+		params->nu_switch_ale = ale_dev_id->nu_switch_ale;
+	}
+
 	ale = devm_kzalloc(params->dev, sizeof(*ale), GFP_KERNEL);
 	if (!ale)
 		return ERR_PTR(-ENOMEM);
@@ -999,8 +1081,6 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 	ale->ageout = ale->params.ale_ageout * HZ;
 
 	rev = readl_relaxed(ale->params.ale_regs + ALE_IDVER);
-	if (!ale->params.major_ver_mask)
-		ale->params.major_ver_mask = 0xff;
 	ale->version =
 		(ALE_VERSION_MAJOR(rev, ale->params.major_ver_mask) << 8) |
 		 ALE_VERSION_MINOR(rev);
diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 735692f066bf..53ad4246617e