Re: [PATCH 08/12] pinctrl: axp209: account for const type of of_device_id.data
On Tue, Jan 2, 2018 at 2:28 PM, Julia Lawall wrote:

> The return value of of_device_get_match_data has type const void *.
> The desc field of the pctl structure also has a const type, so there
> is no need for the const-discarding cast between them.
>
> Done using Coccinelle.
>
> Signed-off-by: Julia Lawall

Patch applied.

Yours,
Linus Walleij

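For readers following along, here is a minimal userspace sketch of the point made in the commit message: a const void * converts implicitly to a pointer-to-const structure, so no cast is required. The names pin_desc, match_entry, get_match_data and pctl_init are hypothetical stand-ins chosen for illustration; they are not the kernel's definitions, and this is not the actual axp209 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's descriptor structure. */
struct pin_desc {
	int npins;
};

/* Hypothetical stand-in for of_device_id: the data field is const void *. */
struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct pin_desc example_desc = { .npins = 3 };

static const struct match_entry example_match = {
	.compatible = "vendor,example",
	.data = &example_desc,
};

/* Same return type as of_device_get_match_data(): const void *. */
static const void *get_match_data(const struct match_entry *m)
{
	return m ? m->data : NULL;
}

/* Hypothetical driver state whose desc field is already const-qualified. */
struct pctl {
	const struct pin_desc *desc;
};

static void pctl_init(struct pctl *p, const struct match_entry *m)
{
	assert(p != NULL);
	/* No cast: const void * converts to const struct pin_desc *. */
	p->desc = get_match_data(m);
}
```

Because desc stays const-qualified, the compiler keeps rejecting writes through it, which is the check the cast would otherwise have bypassed.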
Re: [PATCH] ethernet: mlx4: Delete an error message for a failed memory allocation in five functions
On 01/01/2018 10:46 PM, SF Markus Elfring wrote:

> From: Markus Elfring
> Date: Mon, 1 Jan 2018 21:42:27 +0100
>
> Omit an extra message for a memory allocation failure in these functions.
>
> This issue was detected by using the Coccinelle software.
>
> Signed-off-by: Markus Elfring
> ---

Is this an issue? Why? What is your motivation?
These are error messages, very informative, appear only upon errors, and in control flow.

Re: [PATCH 02/12] pinctrl: at91-pio4: account for const type of of_device_id.data
On Tue, Jan 2, 2018 at 2:27 PM, Julia Lawall wrote:

> This driver creates a const structure that it stores in the data field
> of an of_device_id array.
>
> Adding const to the declaration of the location that receives the
> const value from the data field ensures that the compiler will
> continue to check that the value is not modified. Furthermore, the
> const-discarding cast on the extraction from the data field is no
> longer needed.
>
> Done using Coccinelle.
>
> Signed-off-by: Julia Lawall

Patch applied.

Yours,
Linus Walleij

Re: [PATCH 16/67] powerpc: rename dma_direct_ to dma_nommu_
Hi Michael,

On Wed, Jan 3, 2018 at 7:24 AM, Michael Ellerman wrote:
> Geert Uytterhoeven writes:
>
>> On Tue, Jan 2, 2018 at 10:45 AM, Michael Ellerman wrote:
>>> Christoph Hellwig writes:
>>>
>>>> We want to use the dma_direct_ namespace for a generic implementation,
>>>> so rename powerpc to the second best choice: dma_nommu_.
>>>
>>> I'm not a fan of "nommu". Some of the users of direct ops *are* using an
>>> IOMMU, they're just setting up a 1:1 mapping once at init time, rather
>>> than mapping dynamically.
>>>
>>> Though I don't have a good idea for a better name, maybe "1to1",
>>> "linear", "premapped" ?
>>
>> "identity"?
>
> I think that would be wrong, but thanks for trying to help :)
>
> The address on the device side is sometimes (often?) offset from the CPU
> address. So eg. the device can DMA to RAM address 0x0 using address
> 0x800.
>
> Identity would imply 0 == 0 etc.
>
> I think "bijective" is the correct term, but that's probably a bit
> esoteric.

OK, didn't know about the offset.
Then "linear" is what we tend to use, right?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

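The distinction being discussed (identity versus an offset, "linear" mapping) can be sketched in a few lines of C. This is only an illustration under stated assumptions: EXAMPLE_DMA_OFFSET, cpu_to_dma and dma_to_cpu are hypothetical names, and the 0x800 value is simply the figure quoted in the thread, not a real platform constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed offset between the CPU and device views of RAM. */
#define EXAMPLE_DMA_OFFSET 0x800ULL

/* Linear mapping: device address = CPU address + constant offset. */
static uint64_t cpu_to_dma(uint64_t cpu_addr)
{
	/* Reject addresses that would wrap around when the offset is added. */
	assert(cpu_addr <= UINT64_MAX - EXAMPLE_DMA_OFFSET);
	return cpu_addr + EXAMPLE_DMA_OFFSET;
}

/* The inverse exists because the mapping is one-to-one ("bijective"). */
static uint64_t dma_to_cpu(uint64_t dma_addr)
{
	assert(dma_addr >= EXAMPLE_DMA_OFFSET);
	return dma_addr - EXAMPLE_DMA_OFFSET;
}
```

With a non-zero offset, CPU address 0x0 maps to device address 0x800, so the mapping is linear and invertible but not an identity.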
Re: [PATCH v3 18/27] pinctrl: replace devm_ioremap_nocache with devm_ioremap
On Wed, Jan 3, 2018 at 7:15 AM, Yisheng Xie wrote:
> On 2018/1/2 16:43, Linus Walleij wrote:
>> On Sat, Dec 23, 2017 at 12:00 PM, Yisheng Xie wrote:
>>
>>> Default ioremap is ioremap_nocache, so devm_ioremap has the same
>>> function with devm_ioremap_nocache, which can just be killed to
>>> save the size of devres.o
>>>
>>> This patch is to use use devm_ioremap instead of devm_ioremap_nocache,
>>> which should not have any function change but prepare for killing
>>> devm_ioremap_nocache.
>>>
>>> Cc: Linus Walleij
>>> Cc: linux-g...@vger.kernel.org
>>> Signed-off-by: Yisheng Xie
>>
>> Patch applied.
>
> Well, I list the ARCHs related to the change file, do not include cris,ia64,mn10300
> and openrisc, which ioremap is not the same as ioremap_nocache, as discussed in cover
> letter. So please let me know if I need update the comment.

Yeah, same comment as the GPIO patch.

Yours,
Linus Walleij

Re: [PATCH v3 06/27] gpio: replace devm_ioremap_nocache with devm_ioremap
On Wed, Jan 3, 2018 at 7:05 AM, Yisheng Xie wrote:
> On 2018/1/2 16:41, Linus Walleij wrote:
>> On Sat, Dec 23, 2017 at 11:58 AM, Yisheng Xie wrote:
>>
>>> Default ioremap is ioremap_nocache, so devm_ioremap has the same
>>> function with devm_ioremap_nocache, which can just be killed to
>>> save the size of devres.o
>>>
>>> This patch is to use use devm_ioremap instead of devm_ioremap_nocache,
>>> which should not have any function change but prepare for killing
>>> devm_ioremap_nocache.
>>>
>>> Cc: Linus Walleij
>>> Cc: linux-g...@vger.kernel.org
>>> Signed-off-by: Yisheng Xie
>
> Well, I list the ARCHs related to the change file, do not include cris,ia64,mn10300
> and openrisc, which ioremap is not the same as ioremap_nocache, as discussed in cover
> letter. So please let me know if I need update the comment.

I dropped the patch until it's figured out that none of these
arches are affected by the change.

Please resend with a comment explaining why the change is harmless
on the architectures these drivers are for.

Yours,
Linus Walleij

Re: general protection fault in __netlink_ns_capable
On Tue, Jan 02, 2018 at 04:35:11PM -0800, Andrei Vagin wrote:
> On Tue, Jan 02, 2018 at 10:58:01AM -0800, syzbot wrote:
> > Hello,
> >
> > syzkaller hit the following crash on
> > 75aa5540627fdb3d8f86229776ea87f995275351
> > git://git.cmpxchg.org/linux-mmots.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
> >
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+e432865c29eb4c48c...@syzkaller.appspotmail.com
> > It will help syzbot understand when the bug is fixed. See footer for
> > details.
> > If you forward the report, please keep this part and the footer.
> >
> > netlink: 3 bytes leftover after parsing attributes in process
> > `syzkaller140561'.
> > netlink: 3 bytes leftover after parsing attributes in process
> > `syzkaller140561'.
> > netlink: 3 bytes leftover after parsing attributes in process
> > `syzkaller140561'.
> > kasan: CONFIG_KASAN_INLINE enabled
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault: [#1] SMP KASAN
> > Dumping ftrace buffer:
> >    (ftrace buffer empty)
> > Modules linked in:
> > CPU: 1 PID: 3149 Comm: syzkaller140561 Not tainted 4.15.0-rc4-mm1+ #47
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:__netlink_ns_capable+0x8b/0x120 net/netlink/af_netlink.c:868
>
> NETLINK_CB(skb).sk is NULL here. It looks like we have to use
> sk_ns_capable instead of netlink_ns_capable:
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index c688dc564b11..408c75de52ea 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1762,7 +1762,7 @@ static struct net *get_target_net(struct sk_buff *skb, int netnsid)
>  	/* For now, the caller is required to have CAP_NET_ADMIN in
>  	 * the user namespace owning the target net ns.
>  	 */
> -	if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
> +	if (!sk_ns_capable(skb->sk, net->user_ns, CAP_NET_ADMIN)) {
>  		put_net(net);
>  		return ERR_PTR(-EACCES);
>  	}
>

get_target_net() is used twice in the code. In rtnl_getlink(), we need to
use netlink_ns_capable(skb, ...), but in rtnl_dump_ifinfo, we need to use
sk_ns_capable(skb->sk, ...).

Pls, take a look at this patch:
https://patchwork.ozlabs.org/patch/854896/

Subject: rtnetlink: give a user socket to get_target_net()

Re: [PATCH v3] f2fs: add an ioctl to disable GC for specific file
On 2018/1/3 11:21, Jaegeuk Kim wrote:
> This patch gives a flag to disable GC on given file, which would be useful, when
> user wants to keep its block map. It also conducts in-place-update for dontmove
> file.
>
> Signed-off-by: Jaegeuk Kim
> ---
>
> Change log from v2:
>  - modify ioctl to allow users unpin the file
>
>  fs/f2fs/data.c          |  2 ++
>  fs/f2fs/f2fs.h          | 28 +-
>  fs/f2fs/file.c          | 64 +
>  fs/f2fs/gc.c            | 11 +
>  fs/f2fs/gc.h            |  2 ++
>  fs/f2fs/sysfs.c         |  2 ++
>  include/linux/f2fs_fs.h |  9 ++-
>  7 files changed, 116 insertions(+), 2 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 449b0aaa3905..45f65a5b9871 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1395,6 +1395,8 @@ static inline bool need_inplace_update(struct f2fs_io_info *fio)
>  {
>  	struct inode *inode = fio->page->mapping->host;
>
> +	if (f2fs_is_pinned_file(inode))
> +		return true;
>  	if (S_ISDIR(inode->i_mode) || f2fs_is_atomic_file(inode))
>  		return false;
>  	if (is_cold_data(fio->page))
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index a0e8eec23125..f4b7d73695a7 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -350,6 +350,7 @@ static inline bool __has_cursum_space(struct f2fs_journal *journal,
>  #define F2FS_IOC_GARBAGE_COLLECT_RANGE	_IOW(F2FS_IOCTL_MAGIC, 11,	\
>  						struct f2fs_gc_range)
>  #define F2FS_IOC_GET_FEATURES		_IOR(F2FS_IOCTL_MAGIC, 12, __u32)
> +#define F2FS_IOC_SET_PIN_FILE		_IOW(F2FS_IOCTL_MAGIC, 13, __u32)
>
>  #define F2FS_IOC_SET_ENCRYPTION_POLICY	FS_IOC_SET_ENCRYPTION_POLICY
>  #define F2FS_IOC_GET_ENCRYPTION_POLICY	FS_IOC_GET_ENCRYPTION_POLICY
> @@ -587,7 +588,10 @@ struct f2fs_inode_info {
>  	unsigned long i_flags;		/* keep an inode flags for ioctl */
>  	unsigned char i_advise;		/* use to give file attribute hints */
>  	unsigned char i_dir_level;	/* use for dentry level for large dir */
> -	unsigned int i_current_depth;	/* use only in directory structure */
> +	union {
> +		unsigned int i_current_depth;	/* only for directory depth */
> +		unsigned short i_gc_failures;	/* only for regular file */
> +	};
>  	unsigned int i_pino;		/* parent inode number */
>  	umode_t i_acl_mode;		/* keep file acl mode temporarily */
>
> @@ -1133,6 +1137,9 @@ struct f2fs_sb_info {
>  	/* threshold for converting bg victims for fg */
>  	u64 fggc_threshold;
>
> +	/* threshold for gc trials on pinned files */
> +	u64 gc_pin_file_threshold;
> +
>  	/* maximum # of trials to find a victim segment for SSR and GC */
>  	unsigned int max_victim_search;
>
> @@ -2124,6 +2131,7 @@ enum {
>  	FI_HOT_DATA,		/* indicate file is hot */
>  	FI_EXTRA_ATTR,		/* indicate file has extra attribute */
>  	FI_PROJ_INHERIT,	/* indicate file inherits projectid */
> +	FI_PIN_FILE,		/* indicate file should not be gced */
>  };
>
>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> @@ -2137,6 +2145,7 @@ static inline void __mark_inode_dirty_flag(struct inode *inode,
>  			return;
>  	case FI_DATA_EXIST:
>  	case FI_INLINE_DOTS:
> +	case FI_PIN_FILE:
>  		f2fs_mark_inode_dirty_sync(inode, true);
>  	}
>  }
> @@ -2217,6 +2226,13 @@ static inline void f2fs_i_depth_write(struct inode *inode, unsigned int depth)
>  	f2fs_mark_inode_dirty_sync(inode, true);
>  }
>
> +static inline void f2fs_i_gc_failures_write(struct inode *inode,
> +					unsigned int count)
> +{
> +	F2FS_I(inode)->i_gc_failures = count;
> +	f2fs_mark_inode_dirty_sync(inode, true);
> +}
> +
>  static inline void f2fs_i_xnid_write(struct inode *inode, nid_t xnid)
>  {
>  	F2FS_I(inode)->i_xattr_nid = xnid;
> @@ -2245,6 +2261,8 @@ static inline void get_inline_info(struct inode *inode, struct f2fs_inode *ri)
>  		set_bit(FI_INLINE_DOTS, &fi->flags);
>  	if (ri->i_inline & F2FS_EXTRA_ATTR)
>  		set_bit(FI_EXTRA_ATTR, &fi->flags);
> +	if (ri->i_inline & F2FS_PIN_FILE)
> +		set_bit(FI_PIN_FILE, &fi->flags);
>  }
>
>  static inline void set_raw_inline(struct inode *inode, struct f2fs_inode *ri)
> @@ -2263,6 +2281,8 @@ static inline void set_raw_inline(struct inode *inode, struct f2fs_inode *ri)
>  		ri->i_inline |= F2FS_INLINE_DOTS;
>  	if (is_inode_flag_set(inode, FI_EXTRA_ATTR))
>  		ri->i_inline |= F2FS_EXTRA_ATTR;
> +	if (is_inode_flag_set(inode, FI_PIN_FILE))
> +		ri->i_inline |= F2FS_PIN_FILE;
>  }
>
>  static inline int f2fs_has_extra_attr(struct inode *inode)
> @@
Re: [PATCH 04/13] powerpc/powernv: Add platform-specific services for opencapi
On 19/12/17 02:21, Frederic Barrat wrote:
> Implement a few platform-specific calls which can be used by drivers:
> - provide the Transaction Layer capabilities of the host, so that the
>   driver can find some common ground and configure the device and host
>   appropriately.
> - provide the hw interrupt to be used for translation faults raised by
>   the NPU
> - map/unmap some NPU mmio registers to get the fault context when the
>   NPU raises an address translation fault
>
> The rest are wrappers around the previously-introduced opal calls.
>
> Signed-off-by: Frederic Barrat
> ---
>  arch/powerpc/include/asm/pnv-ocxl.h     |  36 ++
>  arch/powerpc/platforms/powernv/Makefile |   1 +
>  arch/powerpc/platforms/powernv/ocxl.c   | 187
>  3 files changed, 224 insertions(+)
>  create mode 100644 arch/powerpc/include/asm/pnv-ocxl.h
>  create mode 100644 arch/powerpc/platforms/powernv/ocxl.c
>
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> new file mode 100644
> index ..b9ab3f0a9634
> --- /dev/null
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -0,0 +1,36 @@
> +/*
> + * Copyright 2017 IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +#ifndef _ASM_PVN_OCXL_H
> +#define _ASM_PVN_OCXL_H

I assume you meant "PNV" here.

> +
> +#include
> +
> +#define PNV_OCXL_TL_MAX_TEMPLATE	63
> +#define PNV_OCXL_TL_BITS_PER_RATE	4
> +#define PNV_OCXL_TL_RATE_BUF_SIZE	((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
> +
> +extern int pnv_ocxl_get_tl_cap(struct pci_dev *dev, long *cap,
> +			char *rate_buf, int rate_buf_size);
> +extern int pnv_ocxl_set_tl_conf(struct pci_dev *dev, long cap,
> +			uint64_t rate_buf_phys, int rate_buf_size);
> +
> +extern int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq);
> +extern void pnv_ocxl_unmap_xsl_regs(void __iomem *dsisr, void __iomem *dar,
> +			void __iomem *tfc, void __iomem *pe_handle);
> +extern int pnv_ocxl_map_xsl_regs(struct pci_dev *dev, void __iomem **dsisr,
> +			void __iomem **dar, void __iomem **tfc,
> +			void __iomem **pe_handle);
> +
> +extern int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask,
> +			void **platform_data);
> +extern void pnv_ocxl_spa_release(void *platform_data);
> +extern int pnv_ocxl_spa_remove_pe(void *platform_data, int pe_handle);
> +
> +#endif /* _ASM_PVN_OCXL_H */

And here

> diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
> index 3732118a0482..6c9d5199a7e2 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -17,3 +17,4 @@ obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
>  obj-$(CONFIG_PPC_MEMTRACE)	+= memtrace.o
>  obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
>  obj-$(CONFIG_PPC_FTW)	+= nx-ftw.o
> +obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
> diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
> new file mode 100644
> index ..3378b75cf5e5
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/ocxl.c

> +int pnv_ocxl_get_xsl_irq(struct pci_dev *dev, int *hwirq)
> +{
> +	int rc;
> +
> +	rc = of_property_read_u32(dev->dev.of_node, "ibm,opal-xsl-irq", hwirq);
> +	if (rc) {
> +		dev_err(&dev->dev,
> +			"Can't translation xsl interrupt for device\n");

Can't get?

--
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited

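As a side note on the PNV_OCXL_TL_RATE_BUF_SIZE arithmetic quoted above: 64 templates at 4 bits each come to 256 bits, that is 32 bytes. The sketch below reproduces the three macros from the patch in userspace. The helper rate_byte_index is hypothetical and assumes rates are packed two per byte in ascending template order, which the quoted patch does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the quoted header. */
#define PNV_OCXL_TL_MAX_TEMPLATE	63
#define PNV_OCXL_TL_BITS_PER_RATE	4
#define PNV_OCXL_TL_RATE_BUF_SIZE \
	((PNV_OCXL_TL_MAX_TEMPLATE + 1) * PNV_OCXL_TL_BITS_PER_RATE / 8)

/*
 * Hypothetical helper, not part of the patch: index of the byte that
 * would hold the 4-bit rate of a template, ASSUMING two rates per byte
 * in ascending template order.
 */
static size_t rate_byte_index(unsigned int templ)
{
	assert(templ <= PNV_OCXL_TL_MAX_TEMPLATE);
	return (size_t)templ * PNV_OCXL_TL_BITS_PER_RATE / 8;
}
```

Under that packing assumption, the last template (63) falls in byte 31, the final byte of the 32-byte buffer.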
Re: [PATCH 06/13] ocxl: Driver code for 'generic' opencapi devices
On 19/12/17 02:21, Frederic Barrat wrote:

Add an ocxl driver to handle generic opencapi devices. Of course, it's
not meant to be the only opencapi driver, any device is free to
implement its own. But if a host application only needs basic services
like attaching to an opencapi adapter, have translation faults handled
or allocate AFU interrupts, it should suffice.

The AFU config space must follow the opencapi specification and use
the expected vendor/device ID to be seen by the generic driver.

The driver exposes the device AFUs as a char device in /dev/ocxl/

Note that the driver currently doesn't handle memory attached to the
opencapi device.

Signed-off-by: Frederic Barrat
Signed-off-by: Andrew Donnellan
Signed-off-by: Alastair D'Silva

A bunch of sparse warnings we should look at. (there's a few more that
appear in later patches too)

---
 drivers/misc/ocxl/config.c        | 718 ++
 drivers/misc/ocxl/context.c       | 237 +
 drivers/misc/ocxl/file.c          | 405 +
 drivers/misc/ocxl/link.c          | 610
 drivers/misc/ocxl/main.c          |  40 +++
 drivers/misc/ocxl/ocxl_internal.h | 200 +++
 drivers/misc/ocxl/pasid.c         | 114 ++
 drivers/misc/ocxl/pci.c           | 592 +++
 drivers/misc/ocxl/sysfs.c         | 150
 include/uapi/misc/ocxl.h          |  47 +++
 10 files changed, 3113 insertions(+)
 create mode 100644 drivers/misc/ocxl/config.c
 create mode 100644 drivers/misc/ocxl/context.c
 create mode 100644 drivers/misc/ocxl/file.c
 create mode 100644 drivers/misc/ocxl/link.c
 create mode 100644 drivers/misc/ocxl/main.c
 create mode 100644 drivers/misc/ocxl/ocxl_internal.h
 create mode 100644 drivers/misc/ocxl/pasid.c
 create mode 100644 drivers/misc/ocxl/pci.c
 create mode 100644 drivers/misc/ocxl/sysfs.c
 create mode 100644 include/uapi/misc/ocxl.h

diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
new file mode 100644
index ..bb2fde5967e2
--- /dev/null
+++ b/drivers/misc/ocxl/config.c
@@ -0,0 +1,718 @@
+/*
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include
+#include
+#include
+#include "ocxl_internal.h"
+
+#define EXTRACT_BIT(val, bit) (!!(val & BIT(bit)))
+#define EXTRACT_BITS(val, s, e) ((val & GENMASK(e, s)) >> s)
+
+#define OCXL_DVSEC_AFU_IDX_MASK              GENMASK(5, 0)
+#define OCXL_DVSEC_ACTAG_MASK                GENMASK(11, 0)
+#define OCXL_DVSEC_PASID_MASK                GENMASK(19, 0)
+#define OCXL_DVSEC_PASID_LOG_MASK            GENMASK(4, 0)
+
+#define OCXL_DVSEC_TEMPL_VERSION         0x0
+#define OCXL_DVSEC_TEMPL_NAME            0x4
+#define OCXL_DVSEC_TEMPL_AFU_VERSION     0x1C
+#define OCXL_DVSEC_TEMPL_MMIO_GLOBAL     0x20
+#define OCXL_DVSEC_TEMPL_MMIO_GLOBAL_SZ  0x28
+#define OCXL_DVSEC_TEMPL_MMIO_PP         0x30
+#define OCXL_DVSEC_TEMPL_MMIO_PP_SZ      0x38
+#define OCXL_DVSEC_TEMPL_MEM_SZ          0x3C
+#define OCXL_DVSEC_TEMPL_WWID            0x40
+
+#define OCXL_MAX_AFU_PER_FUNCTION 64
+#define OCXL_TEMPL_LEN            0x58
+#define OCXL_TEMPL_NAME_LEN       24
+#define OCXL_CFG_TIMEOUT     3
+
+static int find_dvsec(struct pci_dev *dev, int dvsec_id)
+{
+	int vsec = 0;
+	u16 vendor, id;
+
+	while ((vsec = pci_find_next_ext_capability(dev, vsec,
+					OCXL_EXT_CAP_ID_DVSEC))) {
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
+				&vendor);
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
+		if (vendor == PCI_VENDOR_ID_IBM && id == dvsec_id)
+			return vsec;
+	}
+	return 0;
+}
+
+static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
+{
+	int vsec = 0;
+	u16 vendor, id;
+	u8 idx;
+
+	while ((vsec = pci_find_next_ext_capability(dev, vsec,
+					OCXL_EXT_CAP_ID_DVSEC))) {
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
+				&vendor);
+		pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
+
+		if (vendor == PCI_VENDOR_ID_IBM &&
+			id == OCXL_DVSEC_AFU_CTRL_ID) {
+			pci_read_config_byte(dev,
+					vsec + OCXL_DVSEC_AFU_CTRL_AFU_IDX,
+					&idx);
+			if (idx == afu_idx)
+				return vsec;
+		}
+	}
+	return 0;
+}
+
+static int read_pasid(struct pci_dev *dev, struct ocxl_fn_config *fn)
+{
+	u16 val;
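The lookup pattern in find_dvsec() above can be tried outside the kernel with a small userspace model. This is only a sketch under stated assumptions, not the driver's code: the config space is a plain byte array, model_find_next_ext_cap() is a hypothetical stand-in for pci_find_next_ext_capability(), and the numeric values (capability ID 0x23 for DVSEC, vendor field at +0x4, DVSEC ID field at +0x8, IBM vendor ID 0x1014) are assumed from the PCIe layout since ocxl_internal.h is not quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

/* All constants below are assumptions made for this model. */
#define MODEL_CFG_SIZE        4096
#define MODEL_EXT_CAP_START   0x100
#define MODEL_EXT_CAP_DVSEC   0x23    /* assumed value of OCXL_EXT_CAP_ID_DVSEC */
#define MODEL_DVSEC_VENDOR    0x4     /* assumed OCXL_DVSEC_VENDOR_OFFSET */
#define MODEL_DVSEC_ID        0x8     /* assumed OCXL_DVSEC_ID_OFFSET */
#define MODEL_VENDOR_IBM      0x1014

/* Little-endian reads from the modelled config space. */
static uint16_t model_read16(const uint8_t *cfg, int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t model_read32(const uint8_t *cfg, int off)
{
	return (uint32_t)model_read16(cfg, off) |
	       ((uint32_t)model_read16(cfg, off + 2) << 16);
}

/*
 * Hypothetical stand-in for pci_find_next_ext_capability(): return the
 * offset of the next extended capability with the given ID after `start`
 * (0 means "search from the first one"), or 0 if there is none.
 */
static int model_find_next_ext_cap(const uint8_t *cfg, int start, int cap_id)
{
	int pos = MODEL_EXT_CAP_START;
	int ttl = (MODEL_CFG_SIZE - MODEL_EXT_CAP_START) / 8; /* loop guard */

	if (start)
		pos = (int)((model_read32(cfg, start) >> 20) & 0xffc);

	while (ttl-- > 0 && pos >= MODEL_EXT_CAP_START &&
	       pos <= MODEL_CFG_SIZE - 12) {
		uint32_t hdr = model_read32(cfg, pos);

		if (hdr == 0)
			break;
		if ((int)(hdr & 0xffff) == cap_id)
			return pos;
		pos = (int)((hdr >> 20) & 0xffc); /* bits 31:20 = next offset */
	}
	return 0;
}

/* Same control flow as find_dvsec() in the quoted patch. */
static int model_find_dvsec(const uint8_t *cfg, int dvsec_id)
{
	int vsec = 0;

	while ((vsec = model_find_next_ext_cap(cfg, vsec, MODEL_EXT_CAP_DVSEC))) {
		uint16_t vendor = model_read16(cfg, vsec + MODEL_DVSEC_VENDOR);
		uint16_t id = model_read16(cfg, vsec + MODEL_DVSEC_ID);

		if (vendor == MODEL_VENDOR_IBM && id == dvsec_id)
			return vsec;
	}
	return 0;
}

/* Test fixture: write one DVSEC entry into the modelled config space. */
static void model_put_dvsec(uint8_t *cfg, int off, int next,
			    uint16_t vendor, uint16_t id)
{
	uint32_t hdr = MODEL_EXT_CAP_DVSEC | (1u << 16) | ((uint32_t)next << 20);
	int i;

	assert(off >= MODEL_EXT_CAP_START && off <= MODEL_CFG_SIZE - 12);
	for (i = 0; i < 4; i++)
		cfg[off + i] = (uint8_t)(hdr >> (8 * i));
	cfg[off + MODEL_DVSEC_VENDOR] = (uint8_t)(vendor & 0xff);
	cfg[off + MODEL_DVSEC_VENDOR + 1] = (uint8_t)(vendor >> 8);
	cfg[off + MODEL_DVSEC_ID] = (uint8_t)(id & 0xff);
	cfg[off + MODEL_DVSEC_ID + 1] = (uint8_t)(id >> 8);
}
```

Chaining three entries with model_put_dvsec() and calling model_find_dvsec() shows why both the vendor and the ID are compared: a DVSEC from another vendor can carry the same ID value and must be skipped.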
[PATCHv4 1/2] capability: introduce sysctl for controlled user-ns capability whitelist
From: Mahesh Bandewar

Add a sysctl variable kernel.controlled_userns_caps_whitelist.

Capability mask is stored in kernel as kernel_cap_t type (array of u32).
This sysctl takes input as comma separated hex u32 words. For simplicity
one could see this sysctl to operate on string inputs.

However the value is not expected to change that often during the life
of a kernel-boot. It makes more sense to use the widely available API
instead of bringing another string manipulation for the purpose of
making this simpler.

The default value set (for kernel.controlled_userns_caps_whitelist) is
CAP_FULL_SET indicating that no capability is controlled by default to
maintain compatibility with the existing behavior of user-ns.
Administrator will have to modify this sysctl to control any capability
as such. e.g. to control CAP_NET_RAW the mask need to be changed like -

  # sysctl -q kernel.controlled_userns_caps_whitelist
  kernel.controlled_userns_caps_whitelist = 1f,
  # sysctl -w kernel.controlled_userns_caps_whitelist=1f,dfff
  kernel.controlled_userns_caps_whitelist = 1f,dfff

For bit-to-mask conversion please check include/uapi/linux/capability.h
file.

Any capabilities that are not part of this mask will be controlled and
will not be allowed to processes in controlled user-ns. In above example
CAP_NET_RAW will not be available to controlled-user-namespaces.

Acked-by: Serge Hallyn
Signed-off-by: Mahesh Bandewar
---
v4: commit message changes.
v3: Added couple of comments as requested by Serge Hallyn
v2: Rebase
v1: Initial submission

 Documentation/sysctl/kernel.txt | 21 ++
 include/linux/capability.h      |  3 +++
 kernel/capability.c             | 47 +
 kernel/sysctl.c                 |  5 +
 4 files changed, 76 insertions(+)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 694968c7523c..6aa1e087afee 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -25,6 +25,7 @@ show up in /proc/sys/kernel:
 - bootloader_version	     [ X86 only ]
 - callhome		     [ S390 only ]
 - cap_last_cap
+- controlled_userns_caps_whitelist
 - core_pattern
 - core_pipe_limit
 - core_uses_pid
@@ -187,6 +188,26 @@ CAP_LAST_CAP from the kernel.
 
 ==
 
+controlled_userns_caps_whitelist
+
+Capability mask that is whitelisted for "controlled" user namespaces.
+Any capability that is missing from this mask will not be allowed to
+any process that is attached to a controlled-userns. e.g. if CAP_NET_RAW
+is not part of this mask, then processes running inside any controlled
+userns's will not be allowed to perform action that needs CAP_NET_RAW
+capability. However, processes that are attached to a parent user-ns
+hierarchy that is *not* controlled and has CAP_NET_RAW can continue
+performing those actions. User-namespaces are marked "controlled" at
+the time of their creation based on the capabilities of the creator.
+A process that does not have CAP_SYS_ADMIN will create user-namespaces
+that are controlled.
+
+The value is expressed as two comma separated hex words (u32). This
+sysctl is available in init-ns and users with CAP_SYS_ADMIN in init-ns
+are allowed to make changes.
+
+==
+
 core_pattern:
 
 core_pattern is used to specify a core dumpfile pattern name.
diff --git a/include/linux/capability.h b/include/linux/capability.h
index f640dcbc880c..7d79a4689625 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -14,6 +14,7 @@
 #define _LINUX_CAPABILITY_H
 
 #include
+#include
 
 #define _KERNEL_CAPABILITY_VERSION _LINUX_CAPABILITY_VERSION_3
 
@@ -248,6 +249,8 @@ extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
 
 /* audit system wants to get cap info from files as well */
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
+int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
+				 void __user *buff, size_t *lenp, loff_t *ppos);
 
 extern int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size);
 
diff --git a/kernel/capability.c b/kernel/capability.c
index 1e1c0236f55b..4a859b7d4902 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -29,6 +29,8 @@ EXPORT_SYMBOL(__cap_empty_set);
 
 int file_caps_enabled = 1;
 
+kernel_cap_t controlled_userns_caps_whitelist = CAP_FULL_SET;
+
 static int __init file_caps_disable(char *str)
 {
 	file_caps_enabled = 0;
@@ -507,3 +509,48 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 	rcu_read_unlock();
 	return (ret == 0);
 }
+
+/* Controlled-userns capabilities routines */
+#ifdef CONFIG_SYSCTL
+int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
+
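The bit-to-mask conversion mentioned in the commit message can be sketched in a few lines of userspace C. This is an illustration under assumptions, not code from the patch: the struct and helper names are hypothetical, the "high word, then 8-digit low word" output format is assumed, and the starting low word of 0xffffffff is assumed because the example values in this archived copy look truncated. The only value taken from the kernel headers is CAP_NET_RAW = 13.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* CAP_NET_RAW is bit 13 in include/uapi/linux/capability.h. */
#define EX_CAP_NET_RAW 13

/* Hypothetical two-word mask: word[0] holds caps 0..31, word[1] caps 32..63. */
struct ex_caps_mask {
	uint32_t word[2];
};

/* Clear one capability bit, i.e. make that capability "controlled". */
static void ex_mask_drop_cap(struct ex_caps_mask *m, int cap)
{
	assert(cap >= 0 && cap < 64);
	m->word[cap >> 5] &= ~(UINT32_C(1) << (cap & 31));
}

/* Return non-zero if the capability is still whitelisted. */
static int ex_mask_has_cap(const struct ex_caps_mask *m, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (m->word[cap >> 5] >> (cap & 31)) & 1;
}

/*
 * Format as "<high word>,<low word>" in hex. The ordering and the
 * zero-padded low word are assumptions about the sysctl's text format.
 */
static int ex_mask_format(const struct ex_caps_mask *m, char *buf, size_t len)
{
	return snprintf(buf, len, "%" PRIx32 ",%08" PRIx32,
			m->word[1], m->word[0]);
}
```

Starting from an assumed mask of high word 0x1f and low word 0xffffffff, dropping EX_CAP_NET_RAW clears bit 13 of the low word and leaves 0xffffdfff, which agrees with the "dfff" tail visible in the example above.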
[PATCHv4 2/2] userns: control capabilities of some user namespaces
From: Mahesh Bandewar

With this new notion of "controlled" user-namespaces, the controlled
user-namespaces are marked at the time of their creation while the
capabilities of processes that belong to them are controlled using the
global mask.

Init-user-ns is always uncontrolled and a process that has SYS_ADMIN
that belongs to uncontrolled user-ns can create another (child)
user-namespace that is uncontrolled. Any other process (that either does
not have SYS_ADMIN or belongs to a controlled user-ns) can only create a
user-ns that is controlled.

global-capability-whitelist (controlled_userns_caps_whitelist) is used
at the capability check-time and keeps the semantics for the processes
that belong to uncontrolled user-ns as it is. Processes that belong to
controlled user-ns however are subjected to different checks-

   (a) if the capability in question is controlled and process belongs
       to controlled user-ns, then it's always denied.
   (b) if the capability in question is NOT controlled then fall back
       to the traditional check.

Acked-by: Serge Hallyn
Signed-off-by: Mahesh Bandewar
---
v4: Rebase
v3: Rebase
v2: Don't recalculate user-ns flags for every setns() call.
v1: Initial submission.

 include/linux/capability.h     |  4
 include/linux/user_namespace.h | 25 +
 kernel/capability.c            |  5 +
 kernel/user_namespace.c        |  4
 security/commoncap.c           |  8
 5 files changed, 46 insertions(+)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 7d79a4689625..383f31f066f0 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,10 @@ extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
 int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
 				 void __user *buff, size_t *lenp, loff_t *ppos);
+/* Controlled capability is capability that is missing from the capability-mask
+ * controlled_userns_caps_whitelist controlled via sysctl.
+ */
+bool is_capability_controlled(int cap);
 
 extern int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size);
 
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index d6b74b91096b..a5c48684b317 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -32,6 +32,7 @@ struct uid_gid_map { /* 64 bytes -- 1 cache line */
 };
 
 #define USERNS_SETGROUPS_ALLOWED 1UL
+#define USERNS_CONTROLLED	 2UL
 
 #define USERNS_INIT_FLAGS USERNS_SETGROUPS_ALLOWED
 
@@ -112,6 +113,21 @@ static inline void put_user_ns(struct user_namespace *ns)
 		__put_user_ns(ns);
 }
 
+/* Controlled user-ns is the one that is created by a process that does not
+ * have CAP_SYS_ADMIN (or descended from such an user-ns).
+ * For more details please see the sysctl description of
+ * controlled_userns_caps_whitelist.
+ */
+static inline bool is_user_ns_controlled(const struct user_namespace *ns)
+{
+	return ns->flags & USERNS_CONTROLLED;
+}
+
+static inline void mark_user_ns_controlled(struct user_namespace *ns)
+{
+	ns->flags |= USERNS_CONTROLLED;
+}
+
 struct seq_operations;
 extern const struct seq_operations proc_uid_seq_operations;
 extern const struct seq_operations proc_gid_seq_operations;
@@ -170,6 +186,15 @@ static inline struct ns_common *ns_get_owner(struct ns_common *ns)
 {
 	return ERR_PTR(-EPERM);
 }
+
+static inline bool is_user_ns_controlled(const struct user_namespace *ns)
+{
+	return false;
+}
+
+static inline void mark_user_ns_controlled(struct user_namespace *ns)
+{
+}
 #endif
 
 #endif /* _LINUX_USER_H */
diff --git a/kernel/capability.c b/kernel/capability.c
index 4a859b7d4902..bffe249922de 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -511,6 +511,11 @@ bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns)
 }
 
 /* Controlled-userns capabilities routines */
+bool is_capability_controlled(int cap)
+{
+	return !cap_raised(controlled_userns_caps_whitelist, cap);
+}
+
 #ifdef CONFIG_SYSCTL
 int proc_douserns_caps_whitelist(struct ctl_table *table, int write,
 				 void __user *buff, size_t *lenp, loff_t *ppos)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 246d4d4ce5c7..ca0556d466b6 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -141,6 +141,10 @@ int create_user_ns(struct cred *new)
 		goto fail_keyring;
 
 	set_cred_user_ns(new, ns);
+	if (!ns_capable(parent_ns, CAP_SYS_ADMIN) ||
+	    is_user_ns_controlled(parent_ns))
+		mark_user_ns_controlled(ns);
+
 	return 0;
 fail_keyring:
 #ifdef CONFIG_PERSISTENT_KEYRINGS
diff --git a/security/commoncap.c b/security/commoncap.c
index 4f8e09340956..5454e9c03ee8 100644
---
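The security/commoncap.c hunk is cut off in this archived copy, so the check itself is not visible here. As a rough userspace model of rules (a) and (b) from the commit message, combined with the marking rule from the create_user_ns() hunk, one could write something like the following. Every name prefixed with ex_ is hypothetical, the 64-bit whitelist is a simplification of kernel_cap_t, and the traditional capability check is reduced to a boolean passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_USERNS_CONTROLLED 2UL  /* mirrors USERNS_CONTROLLED in the patch */
#define EX_CAP_NET_RAW       13   /* values from uapi/linux/capability.h */
#define EX_CAP_SYS_ADMIN     21

/* Minimal stand-in for struct user_namespace: only the flags matter here. */
struct ex_userns {
	unsigned long flags;
};

/* Models controlled_userns_caps_whitelist; all bits set like CAP_FULL_SET. */
static uint64_t ex_whitelist = ~UINT64_C(0);

/* A capability is "controlled" when it is missing from the whitelist. */
static bool ex_is_capability_controlled(int cap)
{
	assert(cap >= 0 && cap < 64);
	return !(ex_whitelist & (UINT64_C(1) << cap));
}

static bool ex_is_user_ns_controlled(const struct ex_userns *ns)
{
	return (ns->flags & EX_USERNS_CONTROLLED) != 0;
}

/*
 * Marking rule from the create_user_ns() hunk: the child is controlled if
 * the creator lacks CAP_SYS_ADMIN or the parent is already controlled.
 */
static struct ex_userns ex_create_user_ns(const struct ex_userns *parent,
					  bool creator_has_sys_admin)
{
	struct ex_userns ns = { 0 };

	if (!creator_has_sys_admin || ex_is_user_ns_controlled(parent))
		ns.flags |= EX_USERNS_CONTROLLED;
	return ns;
}

/*
 * Rule (a): controlled capability in a controlled user-ns is always denied.
 * Rule (b): otherwise defer to the traditional check, modelled here by the
 * caller-supplied `traditional_result`.
 */
static bool ex_capable(const struct ex_userns *ns, int cap,
		       bool traditional_result)
{
	if (ex_is_user_ns_controlled(ns) && ex_is_capability_controlled(cap))
		return false;
	return traditional_result;
}
```

With the default whitelist nothing changes for anyone, which matches the compatibility claim in the series. Only after a bit is cleared from ex_whitelist does a controlled namespace lose that capability, and an uncontrolled one still goes through the traditional check.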
[PATCHv4 0/2] capability controlled user-namespaces
From: Mahesh Bandewar

TL;DR version
-------------
Creating a sandbox environment with namespaces is challenging
considering what these sandboxed processes can engage in, e.g.
CVE-2017-6074, CVE-2017-7184, CVE-2017-7308, to name just a few.
The current form of user-namespaces, however, if changed a bit, can
allow us to create a sandbox environment without locking down
user-namespaces.

Detailed version
----------------

Problem
-------
User-namespaces in their current form have increased the attack surface,
as any process can acquire capabilities which are not available to it
(by default) by performing a combination of clone()/unshare()/setns()
syscalls.

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sched.h>       /* unshare(), CLONE_NEWUSER, CLONE_NEWNET */
    #include <unistd.h>      /* close() */
    #include <sys/socket.h>  /* socket(), AF_INET6, SOCK_RAW */
    #include <netinet/in.h>  /* IPPROTO_RAW */

    int main(int ac, char **av)
    {
        int sock = -1;

        /* An unprivileged process is expected to fail here. */
        printf("Attempting to open RAW socket before unshare()...\n");
        sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
        if (sock < 0) {
            perror("socket() SOCK_RAW failed: ");
        } else {
            printf("Successfully opened RAW-Sock before unshare().\n");
            close(sock);
            sock = -1;
        }

        /* Enter new user and network namespaces. */
        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
            perror("unshare() failed: ");
            return 1;
        }

        /* The same call now succeeds inside the new namespaces. */
        printf("Attempting to open RAW socket after unshare()...\n");
        sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);
        if (sock < 0) {
            perror("socket() SOCK_RAW failed: ");
        } else {
            printf("Successfully opened RAW-Sock after unshare().\n");
            close(sock);
            sock = -1;
        }

        return 0;
    }

The above example shows how easy it is to acquire the NET_RAW
capability. Once it is acquired, these processes could exploit the
above-mentioned or similar issues (discovered or undiscovered) with
malicious intent. Note that this is just an example and the
problem/solution is not limited to the NET_RAW capability *only*.

The easiest fix one can apply here is to lock down user-namespaces,
which many of the distros do (i.e. don't allow users to create user
namespaces), but unfortunately that prevents everyone from using them.

Approach
--------
Introduce a notion of 'controlled' user-namespaces. Every process on
the host is allowed to create user-namespaces (governed by the limit
imposed by the per-ns sysctl); however, user-namespaces created by
sandboxed processes are marked as 'controlled'. This 'mark' is used at
the time of the capability check in conjunction with a global
capability whitelist. If a capability is not whitelisted, processes
that belong to controlled user-namespaces will not be granted it.

Processes that do not have CAP_SYS_ADMIN in init-ns can *only* create
controlled user-namespaces. In other words, user-namespaces created by
privileged processes (those which have CAP_SYS_ADMIN in init-ns) are
not controlled. A hierarchy underneath any controlled user-ns is
always controlled.

The global whitelist is a list of capabilities governed by a sysctl
(kernel.controlled_userns_caps_whitelist). A (privileged) user in
init-ns can modify it, and it applies to all controlled
user-namespaces on the host irrespective of when each user-ns was
created.

Marking user-namespaces controlled without modifying the whitelist is
equivalent to the current behavior. The default value of the whitelist
includes all capabilities, so compatibility is maintained. However, it
gives admins the fine-grained ability to control various capabilities
system wide without locking down user-namespaces.

Example
-------
Here is an example that demonstrates the behavior of a kernel that has
this patch-set applied. It uses the same C code from this commit log,
saved as acquire_raw.c.

(a) The 'root' user has all the capabilities all the time (before and
    after the capability is taken away).

    root@vm0:~# id
    uid=0(root) gid=0(root) groups=0(root)
    root@vm0:~# sysctl -q kernel.controlled_userns_caps_whitelist
    kernel.controlled_userns_caps_whitelist = 1f,
    root@vm0:~# ./acquire_raw
    Attempting to open RAW socket before unshare()...
    Successfully opened RAW-Sock before unshare().
    Attempting to open RAW socket after unshare()...
    Successfully opened RAW-Sock after unshare().
    root@vm0:~# sysctl -w kernel.controlled_userns_caps_whitelist=1f,dfff
    kernel.controlled_userns_caps_whitelist = 1f,dfff
    root@vm0:~# ./acquire_raw
    Attempting to open RAW socket before unshare()...
    Successfully opened RAW-Sock before unshare().
    Attempting to open RAW socket after unshare()...
    Successfully opened RAW-Sock after unshare().

(b) Unprivileged user cannot change the mask.

    mahesh@vm0:~$ id
    uid=1000(mahesh) gid=1000(mahesh) groups=1000(mahesh),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),118(lpadmin),128(sambashare)
    mahesh@vm0:~$ sysctl -q kernel.controlled_userns_caps_whitelist
    kernel.controlled_userns_caps_whitelist = 1f,
    mahesh@vm0:~$
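As a rough illustration of the approach described in this message, the marking rule and the whitelist check can be modelled in plain userspace C. This is only a sketch under stated assumptions: `userns_model`, `userns_create()` and `cap_allowed()` are hypothetical names invented here, not the patch-set's kernel code, and the whitelist is reduced to a single 64-bit mask. The value 13 for CAP_NET_RAW is the one defined in <linux/capability.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CAP_NET_RAW 13 /* same number as CAP_NET_RAW */

/* Hypothetical userspace model of a user namespace; not the kernel struct. */
struct userns_model {
	const struct userns_model *parent; /* NULL for init-ns */
	bool controlled;
};

/*
 * Model of namespace creation: the child is controlled if its parent is
 * already controlled, or if the creator lacks CAP_SYS_ADMIN in init-ns.
 */
struct userns_model
userns_create(const struct userns_model *parent, bool creator_is_init_admin)
{
	struct userns_model child;

	assert(parent != NULL);
	child.parent = parent;
	child.controlled = parent->controlled || !creator_is_init_admin;
	return child;
}

/*
 * Model of the extra check: inside a controlled namespace a capability
 * is usable only if its bit is set in the global whitelist. Namespaces
 * that are not controlled are unaffected by the whitelist.
 */
bool cap_allowed(const struct userns_model *ns, int cap, uint64_t whitelist)
{
	assert(ns != NULL && cap >= 0 && cap < 64);
	if (!ns->controlled)
		return true;
	return (whitelist >> cap) & 1;
}
```

With a whitelist that has bit 13 cleared, `cap_allowed()` returns false for a namespace created by an unprivileged process and for anything nested below it, while a namespace created by an init-ns admin is unaffected, which matches the root session shown in the example.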
Re: [PATCH V8 0/3] OPP: Allow OPP table to be used for power-domains
On 18-12-17, 15:51, Viresh Kumar wrote: > Hi, > > Now that the performance state of PM domains are supported by the kernel > (merged in linux-next), I am trying once again to define the bindings > which we dropped until the code is merged first. > > Summary: > > Power-domains can also have their active states and this patchset > enhances the OPP binding to define those. > > The power domains can use the OPP bindings mostly as is. Though there > are some changes required to support special cases: > > - Allow "operating-points-v2" to contain multiple phandles for power > domain providers providing multiple domains. > > - A new property "required-opp" is added for the devices to specify the > minimum required OPP of the master domain or any other type of device. > > - Allow some of the OPP properties to accept magic values (firmware > dependent) as the OS doesn't know the real freq/voltage values. > > V7->V8: > - V7 1/2 divided into two patches. 1/3 is unchanged from V7. > - 2/3 renamed the property from "power-domain-opp" to "required-opp", as > suggested by Rob. > - Added Ulf's reviewed-by for 1/3 and 3/3. > > -- > viresh > > Viresh Kumar (3): > OPP: Allow OPP table to be used for power-domains > OPP: Introduce "required-opp" property > OPP: Allow "opp-hz" and "opp-microvolt" to contain magic values Discussions are still going on for the last commit, though the first two are already Acked by Rob and Ulf and are quite independent of the third one. Any objections to getting the first two merged for 4.16-rc1 ? I will send them to Rafael on Friday if no one objects. -- viresh
Re: [Intel-gfx] [PATCH v2] drm/i915: Try EDID bitbanging on HDMI after failed read
On Tue, 02 Jan 2018, Chris Wilson wrote: > Quoting Rodrigo Vivi (2018-01-02 19:12:18) >> On Sun, Dec 31, 2017 at 10:34:54PM +, Stefan Brüns wrote: >> > + edid = drm_get_edid(connector, i2c); >> > + >> > + if (!edid && !intel_gmbus_is_forced_bit(i2c)) { >> > + DRM_DEBUG_KMS("HDMI GMBUS EDID read failed, retry using GPIO >> > bit-banging\n"); >> > + intel_gmbus_force_bit(i2c, true); >> > + edid = drm_get_edid(connector, i2c); >> > + intel_gmbus_force_bit(i2c, false); >> > + } >> >> Approach seems fine for this case. >> I just wonder what would be the risks of forcing this bit and edid read when >> nothing is present on the other end? > > Should be no more risky than using GMBUS as the bit-banging is the > underlying HW protocol; it should just be adding an extra delay to > the disconnected probe. Offset against the chance that it fixes > detection of borderline devices. > > I would say that given the explanation above, the question is why not > apply it universally? (Bonus points for including the explanation as > comments.) I'm wondering, is gmbus too fast for the adapters, does gmbus generally have different timing for the ack/nak as described in the commit message than bit banging, or are the adapters just plain buggy? Do we have any control over gmbus timings (don't have the time to peruse the bpsec just now)? BR, Jani. > -Chris > ___ > Intel-gfx mailing list > intel-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source Technology Center
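The control flow of the quoted hunk (try the normal read, and only on failure retry once with bit-banging, then restore the previous mode) can be sketched outside the driver as follows. This is a minimal model under stated assumptions: `bus_model`, `read_edid_model()` and `read_edid_with_fallback()` are hypothetical stand-ins and do not correspond to i915 or DRM functions; the model also simply assumes that bit-banging succeeds whenever a sink is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an I2C adapter; not the i915 structure. */
struct bus_model {
	bool force_bit;     /* true: bit-banging, false: hardware engine */
	bool hw_read_works; /* whether the hardware-engine read succeeds */
	bool sink_present;  /* whether anything answers on the bus at all */
};

static const char edid_blob[] = "edid";

/* Hypothetical read: NULL on failure, a dummy blob on success. */
const char *read_edid_model(const struct bus_model *bus)
{
	if (!bus->sink_present)
		return NULL;
	if (bus->force_bit || bus->hw_read_works)
		return edid_blob;
	return NULL;
}

/* Try the fast path first; on failure retry once with bit-banging. */
const char *read_edid_with_fallback(struct bus_model *bus)
{
	const char *edid;

	assert(bus != NULL);
	edid = read_edid_model(bus);
	if (!edid && !bus->force_bit) {
		bus->force_bit = true;
		edid = read_edid_model(bus);
		bus->force_bit = false; /* restore the original mode */
	}
	return edid;
}
```

In this model a disconnected port costs exactly one extra failed read, which is the "extra delay to the disconnected probe" mentioned in the reply.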
Re: 4.15-rc6 PTI regression: L1 TLB mismatch MCE on Athlon64
> > These MCE-s do not happen on 4.14 and 4.15.0-rc4-00041-gace52288edf0. > > They do happen on each boot into 4.15-rc6. Will try to bisect. > > Please do. And try -rc5 too. 4.15-rc5 is OK. Will try CONFIG_X86_PTDUMP on the next kernel. > And then Linus' pti merges: > > 52c90f2d32bfa7d6eccd66a56c44ace1f78fbadd > 5aa90a84589282b87666f92b6c3c917c8080a9bf > caf9a82657b313106aae8f4a35936c116a152299 > 64a48099b3b31568ac45716b7fafcb74a0c2fcfe -- Meelis Roos (mr...@linux.ee)
Re: [PATCH] exec: Weaken dumpability for secureexec
On Tue, Jan 02, 2018 at 03:21:33PM -0800, Kees Cook wrote: > This is a logical revert of: > > commit e37fdb785a5f ("exec: Use secureexec for setting dumpability") > > This weakens dumpability back to checking only for uid/gid changes in > current (which is useless), but userspace depends on dumpability not > being tied to secureexec. > > https://bugzilla.redhat.com/show_bug.cgi?id=1528633 > > Reported-by: Tom Horsley > Fixes: e37fdb785a5f ("exec: Use secureexec for setting dumpability") > Cc: sta...@vger.kernel.org > Signed-off-by: Kees Cook > --- > fs/exec.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/exec.c b/fs/exec.c > index 5688b5e1b937..7eb8d21bcab9 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1349,9 +1349,14 @@ void setup_new_exec(struct linux_binprm * bprm) > > current->sas_ss_sp = current->sas_ss_size = 0; > > - /* Figure out dumpability. */ > + /* > + * Figure out dumpability. Note that this checking only of current > + * is wrong, but userspace depends on it. This should be testing > + * bprm->secureexec instead. > + */ > if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP || > - bprm->secureexec) > + !(uid_eq(current_euid(), current_uid()) && > + gid_eq(current_egid(), current_gid( So what about the pdeath_signal? Is that going to be another subtle time-bomb? > set_dumpable(current->mm, suid_dumpable); > else > set_dumpable(current->mm, SUID_DUMP_USER); > -- > 2.7.4 > > > -- > Kees Cook > Pixel Security
Re: About the try to remove cross-release feature entirely by Ingo
On Wed, Jan 03, 2018 at 11:10:37AM +0900, Byungchul Park wrote: > > The point I was trying to drive home is that "all we have to do is > > just classify everything well or just invalidate the right lock > > Just to be sure, we don't have to invalidate lock objects at all but > a problematic waiter only. So essentially you are proposing that we have to play "whack-a-mole" as we find false positives, and where we may have to put in ad-hoc plumbing to only invalidate "a problematic waiter" when it's problematic --- or to entirely suppress the problematic waiter altogether. And in that case, a file system developer might be forced to invalidate a lock/"waiter"/"completion" in another subsystem. I will also remind you that doing this will trigger a checkpatch.pl *error*: ERROR("LOCKDEP", "lockdep_no_validate class is reserved for device->mutex.\n" . $herecurr); - Ted
Re: [PATCH] exec: Weaken dumpability for secureexec
On Tue, Jan 02, 2018 at 03:21:33PM -0800, Kees Cook wrote: > This is a logical revert of: > > commit e37fdb785a5f ("exec: Use secureexec for setting dumpability") > > This weakens dumpability back to checking only for uid/gid changes in > current (which is useless), but userspace depends on dumpability not > being tied to secureexec. > > https://bugzilla.redhat.com/show_bug.cgi?id=1528633 > > Reported-by: Tom Horsley Seems right, any chance we could get a tested-by: Tom? (Did we already get that?) > Fixes: e37fdb785a5f ("exec: Use secureexec for setting dumpability") > Cc: sta...@vger.kernel.org > Signed-off-by: Kees Cook > --- > fs/exec.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/exec.c b/fs/exec.c > index 5688b5e1b937..7eb8d21bcab9 100644 > --- a/fs/exec.c > +++ b/fs/exec.c > @@ -1349,9 +1349,14 @@ void setup_new_exec(struct linux_binprm * bprm) > > current->sas_ss_sp = current->sas_ss_size = 0; > > - /* Figure out dumpability. */ > + /* > + * Figure out dumpability. Note that this checking only of current > + * is wrong, but userspace depends on it. This should be testing > + * bprm->secureexec instead. > + */ > if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP || > - bprm->secureexec) > + !(uid_eq(current_euid(), current_uid()) && > + gid_eq(current_egid(), current_gid( > set_dumpable(current->mm, suid_dumpable); > else > set_dumpable(current->mm, SUID_DUMP_USER); > -- > 2.7.4 > > > -- > Kees Cook > Pixel Security
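To make the condition in the quoted hunk easier to follow, here is a small userspace model of the decision it restores. It is a sketch only: `exec_model` and `dumpable_after_exec()` are hypothetical names, the credentials are plain integers rather than kuid_t/kgid_t, and nothing here is the kernel's code. The value 1 for SUID_DUMP_USER is taken from the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUID_DUMP_USER 1 /* same value as in the kernel */

/* Hypothetical flattened view of the inputs; not kernel structures. */
struct exec_model {
	bool enforce_nondump;  /* BINPRM_FLAGS_ENFORCE_NONDUMP is set */
	unsigned int uid;      /* real uid of current */
	unsigned int euid;     /* effective uid of current */
	unsigned int gid;      /* real gid of current */
	unsigned int egid;     /* effective gid of current */
	int suid_dumpable;     /* value of the suid_dumpable setting */
};

/*
 * Model of the restored check: fall back to suid_dumpable when the
 * non-dump flag is set or when the effective ids differ from the real
 * ids; otherwise the process stays dumpable by its owner.
 */
int dumpable_after_exec(const struct exec_model *m)
{
	assert(m != NULL);
	if (m->enforce_nondump ||
	    !(m->euid == m->uid && m->egid == m->gid))
		return m->suid_dumpable;
	return SUID_DUMP_USER;
}
```

Because the model only compares ids of the current task, an exec that gains privilege some other way (which is what bprm->secureexec also covers) still ends up as SUID_DUMP_USER here; that is the weakness the new code comment in the patch points out.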
Re: [PATCH] bonding: Delete an error message for a failed memory allocation in bond_update_slave_arr()
On Mon, Jan 1, 2018 at 8:07 AM, SF Markus Elfring wrote: > From: Markus Elfring > Date: Mon, 1 Jan 2018 17:00:04 +0100 > > Omit an extra message for a memory allocation failure in this function. > > This issue was detected by using the Coccinelle software. > What is the issue with this message? > Signed-off-by: Markus Elfring > --- > drivers/net/bonding/bond_main.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index c669554d70bb..a96e0c9cc4bf 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -3910,7 +3910,6 @@ int bond_update_slave_arr(struct bonding *bond, struct > slave *skipslave) > GFP_KERNEL); > if (!new_arr) { > ret = -ENOMEM; > - pr_err("Failed to build slave-array.\n"); > goto out; > } > if (BOND_MODE(bond) == BOND_MODE_8023AD) { > -- > 2.15.1 >
Re: [PATCH] nokia N9: Add support for magnetometer and touchscreen
Hi, On 01/02/2018 06:27 PM, Sebastian Reichel wrote: > Hi, > > On Tue, Jan 02, 2018 at 02:17:22PM +0100, Pavel Machek wrote: >> This adds dts support for magnetometer and touchscreen on Nokia N9. > > I think it makes sense to have this splitted. > >> Signed-off-by: Pavel Machek >> >> diff --git a/arch/arm/boot/dts/omap3-n9.dts b/arch/arm/boot/dts/omap3-n9.dts >> index 39e35f8..57a6679 100644 >> --- a/arch/arm/boot/dts/omap3-n9.dts >> +++ b/arch/arm/boot/dts/omap3-n9.dts >> @@ -36,6 +57,22 @@ >> }; >> }; >> }; >> + >> +touch@4b { > > touchscreen@ > >> +compatible = "atmel,maxtouch"; >> +reg = <0x4b>; >> +interrupt-parent = <>; >> +interrupts = <29 2>; /* gpio_61, IRQF_TRIGGER_FALLING*/ > > reset-gpios = < 17 GPIO_ACTIVE_SOMETHING>; > I'm using reset-gpios = < 17 0>; >> +vdd-supply = <>; >> +avdd-supply = <>; > > Those two are not mentioned in the binding and not supported by the > driver as far as I can see? > Right, but vio and vaux1 need to be on - the reason why it's working at all is because lis302 uses the same regulators and turns them on. IMHO either we add the support for regulators to maxtouch driver or we add regulator-always-on to vio and vaux1. >> +}; >> +}; > > Touchscreen with the same settings is required for n950, so it > should be in the shared n950 + n9 file. > As a side-note, there is no pinmux mentioned and usually I'd use OMAP3_CORE1_IOPAD(0x20c8, PIN_INPUT | MUX_MODE4) /* gpio_61*/ OMAP3_CORE1_IOPAD(0x20f2, PIN_OUTPUT | MUX_MODE4) /* gpio_81*/ For reasons that I can't explain, first line (gpmc_nbe1->gpio_61) breaks it for me, so I've commented it out. Still, if anyone has an idea what is wrong with that please let me know. >> + { >> +ak8975@0f { >> +compatible = "asahi-kasei,ak8975"; >> +reg = <0x0f>; >> +}; >> }; > > Looking at the N9 board file this is missing a rotation matrix. This > is supported by the binding: > > Documentation/devicetree/bindings/iio/magnetometer/ak8975.txt > >> >> { > > -- Sebastian > Best regards, Filip
Re: [PATCH] mm/fadvise: discard partial pages iff endbyte is also eof
> On Dec 23, 2017, at 12:16, 十刀 wrote:
>
> From: "shidao.ytt"
>
> in commit 441c228f817f7 ("mm: fadvise: document the
> fadvise(FADV_DONTNEED) behaviour for partial pages") Mel Gorman
> explained why partial pages should be preserved instead of discarded
> when using fadvise(FADV_DONTNEED), however the actual codes to calculate
> end_index was unexpectedly wrong, the code behavior didn't match to the
> statement in comments; Luckily in another commit 18aba41cbf
> ("mm/fadvise.c: do not discard partial pages with POSIX_FADV_DONTNEED")
> Oleg Drokin fixed this behavior
>
> Here I come up with a new idea that actually we can still discard the
> last partial page iff the page-unaligned endbyte is also the end of
> file, since no one else will use the rest of the page and it should be
> safe enough to discard.

+akpm...

Hi Mel, Andrew:

Would you please take a look at this patch, to see if this proposal
is reasonable enough, thanks in advance!

Thanks,
Caspar

>
> Signed-off-by: shidao.ytt
> Signed-off-by: Caspar Zhang
> ---
> mm/fadvise.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index ec70d6e..f74b21e 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -127,7 +127,8 @@
>  		 */
>  		start_index = (offset+(PAGE_SIZE-1)) >> PAGE_SHIFT;
>  		end_index = (endbyte >> PAGE_SHIFT);
> -		if ((endbyte & ~PAGE_MASK) != ~PAGE_MASK) {
> +		if ((endbyte & ~PAGE_MASK) != ~PAGE_MASK &&
> +				endbyte != inode->i_size - 1) {
>  			/* First page is tricky as 0 - 1 = -1, but pgoff_t
>  			 * is unsigned, so the end_index >= start_index
>  			 * check below would be true and we'll discard the whole
> --
> 1.8.3.1
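The index arithmetic in the quoted hunk is easy to misread, so here is a userspace sketch of it with concrete numbers. Everything in it is an assumption made for illustration: `dontneed_range()` is a hypothetical helper, PG_SHIFT/PG_SIZE/PG_MASK mirror the kernel's PAGE_* macros for 4 KiB pages, `endbyte` is taken to be the inclusive last byte of the request, and the `discard_tail_at_eof` flag switches between the existing check and the proposed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SHIFT 12 /* assumes 4 KiB pages */
#define PG_SIZE  ((uint64_t)1 << PG_SHIFT)
#define PG_MASK  (~(PG_SIZE - 1))

/*
 * Compute the inclusive page range [*start, *end] that would be
 * discarded. Returns false when no whole page falls inside the request.
 */
bool dontneed_range(uint64_t offset, uint64_t endbyte, uint64_t i_size,
		    bool discard_tail_at_eof,
		    uint64_t *start, uint64_t *end)
{
	assert(start != NULL && end != NULL && endbyte >= offset);

	/* Round the start up and the end down to page indexes. */
	*start = (offset + (PG_SIZE - 1)) >> PG_SHIFT;
	*end = endbyte >> PG_SHIFT;

	/*
	 * endbyte stops short of the last byte of its page, so that page
	 * is only partially covered and is normally preserved. With the
	 * proposed change it is kept in the range when endbyte is also
	 * the last byte of the file.
	 */
	if ((endbyte & ~PG_MASK) != ~PG_MASK &&
	    !(discard_tail_at_eof && endbyte == i_size - 1)) {
		if (*end == 0)
			return false; /* avoid 0 - 1 wrapping around */
		(*end)--;
	}
	return *end >= *start;
}
```

For a 10000-byte file and a request covering bytes 0 to 9999, pages 0 and 1 are full and page 2 holds only bytes 8192 to 9999. The existing check yields the range 0..1, while the proposed check yields 0..2 because byte 9999 is the end of the file.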
Re: [PATCH 2/3] dt-bindings: mtd: atmel-quadspi: add an optional property 'dmacap,memcpy'
On Tue, Jan 02, 2018 at 07:18:58PM +, Trent Piepho wrote:
> On Tue, 2018-01-02 at 11:22 +0100, Ludovic Desroches wrote:
> > On Wed, Dec 27, 2017 at 10:40:00PM +0100, Cyrille Pitchen wrote:
> > >
> > > Or maybe no change at all is required at the at_xdmac.c driver side: we
> > > just don't care about the provided flags in the "dmas" property,
> > > especially the "peripheral id". They would be ignored anyway when the
> > > atmel-quadspi.c driver later calls dmaengine_prep_dma_memcpy(). So I
> > > could simply set the dma cells to 0 in the device-tree?
> > >
> > > Ludovic, what do you think about that ?
> >
> > It may work but I won't do this. Usually, channels requested through the
> > xlate function have usually their capabilities set to DMA_SLAVE and not
> > DMA_MEMCPY.
> > In the at_xdmac case, it won't be an issue but if you have a controller
> > which has channels which can support only mem-to-mem or peripheral, it
> > won't work.
>
> Maybe one could create an "AT91_XDMAC_DT_" macro to indicate a memcpy
> channel. There are still unused bits for another flag. It also looks
> like at_xdma uses peripheral id 0x3f for memcpy transfers (will that
> work with memcpy DMA on multiple channels at the same time?). So
> perhaps perid 0x3f could be the indication of wanting a memcpy channel,
> rather than another flag bit. But however it's done, one writes:
>
> dmas = < AT91_XDMAC_DT_MEMCPY>; dma-names = "rx-tx";
>

I have no objection to doing that. My concerns are:

- Most (all?) of the DMA controllers use the xlate function to provide
a slave channel. Does it have to provide a slave channel, or can we use
it for all kinds of channels? From my point of view we can do it; I just
need confirmation.

- This set of patches is focused on the atmel qspi controller, but other
ones may be interested in doing the same thing, so they would have to
update the behavior of the xlate function of the DMA controller they
are using.

So having the request of a DMA_MEMCPY channel inside the spi/qspi
controller doesn't seem to be a wrong idea. Moreover, it may be
confusing for a user who doesn't know the context: why do I have to
use memcpy and not slave as usual?

Honestly I have no opinion about the way to do it. Both have pros and
cons.

> I think one could have the quadspi driver automatically fill in the dma
> cell in the dma specifier if it is not present in the device tree. So
> one could write "dmas = <>" and the driver adds the
> AT91_XDMAC_DT_MEMCPY cell before xlating. I'm not sure if that's a
> good idea or not.

I don't think so, there is enough black magic, let's try to not add
more :p

Regards

Ludovic
Re: [PATCH 0/2] perf-probe: Improve warning message for buildid mismatch
On 12/18/2017 12:58 PM, Masami Hiramatsu wrote: > Hello, > > This series ensure the build-ids for target binary and debuginfo > are matched. If there is a mismatch, it warns user to check the > package versions. For the series, Reviewed-by: Ravi Bangoria
Re: [f2fs-dev] [PATCH v3] f2fs: add reserved blocks for root user
On 2018/1/3 3:24, Jaegeuk Kim wrote: >> How about adding uid & gid verification also like ext4? > > Again, that's another feature which requires a mount option. I think it'd be > better to add that, once we have a use-case. That's OK. ;) Thanks,
Re: [f2fs-dev] [PATCH v5] f2fs: add reserved blocks for root user
On 2018/1/3 10:21, Jaegeuk Kim wrote: > This patch allows root to reserve some blocks via mount option. > > "-o reserve_root=N" means N x 4KB-sized blocks for root only. > > Signed-off-by: Jaegeuk Kim > --- > > Change log from v4: > - fix f_bfree in statfs Could you fix the f_bfree calculation issue in a separate patch prior to this one? That would be better for tracking patch history and for git bisect when backtracking issues. One more thing: should we move the reserve_root_limit check into parse_options? As it stands, it looks like a remount can set root_reserved_blocks beyond our defined limit. Thanks, > > fs/f2fs/f2fs.h | 26 ++ > fs/f2fs/super.c | 34 +- > fs/f2fs/sysfs.c | 3 ++- > 3 files changed, 53 insertions(+), 10 deletions(-) > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > index 07e03990420b..a0e8eec23125 100644 > --- a/fs/f2fs/f2fs.h > +++ b/fs/f2fs/f2fs.h > @@ -95,6 +95,7 @@ extern char *fault_name[FAULT_MAX]; > #define F2FS_MOUNT_PRJQUOTA 0x0020 > #define F2FS_MOUNT_QUOTA 0x0040 > #define F2FS_MOUNT_INLINE_XATTR_SIZE 0x0080 > +#define F2FS_MOUNT_RESERVE_ROOT 0x0100 > > #define clear_opt(sbi, option) ((sbi)->mount_opt.opt &= > ~F2FS_MOUNT_##option) > #define set_opt(sbi, option) ((sbi)->mount_opt.opt |= F2FS_MOUNT_##option) > @@ -1105,6 +1106,7 @@ struct f2fs_sb_info { > block_t last_valid_block_count; /* for recovery */ > block_t reserved_blocks;/* configurable reserved blocks > */ > block_t current_reserved_blocks;/* current reserved blocks */ > + block_t root_reserved_blocks; /* root reserved blocks */ > > unsigned int nquota_files; /* # of quota sysfile */ > > @@ -1554,6 +1556,12 @@ static inline bool f2fs_has_xattr_block(unsigned int > ofs) > return ofs == XATTR_NODE_OFFSET; > } > > +static inline block_t reserve_root_limit(struct f2fs_sb_info *sbi) > +{ > + /* limit is 0.2% */ > + return (sbi->user_block_count << 1) / 1000; > +} > + > static inline void f2fs_i_blocks_write(struct inode *, block_t, bool, bool); > static inline int 
inc_valid_block_count(struct f2fs_sb_info *sbi, >struct inode *inode, blkcnt_t *count) > @@ -1583,11 +1591,17 @@ static inline int inc_valid_block_count(struct > f2fs_sb_info *sbi, > sbi->total_valid_block_count += (block_t)(*count); > avail_user_block_count = sbi->user_block_count - > sbi->current_reserved_blocks; > + > + if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE))) > + avail_user_block_count -= sbi->root_reserved_blocks; > + > if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) { > diff = sbi->total_valid_block_count - avail_user_block_count; > + if (diff > *count) > + diff = *count; > *count -= diff; > release = diff; > - sbi->total_valid_block_count = avail_user_block_count; > + sbi->total_valid_block_count -= diff; > if (!*count) { > spin_unlock(&sbi->stat_lock); > percpu_counter_sub(&sbi->alloc_valid_block_count, diff); > @@ -1776,9 +1790,13 @@ static inline int inc_valid_node_count(struct > f2fs_sb_info *sbi, > > spin_lock(&sbi->stat_lock); > > - valid_block_count = sbi->total_valid_block_count + 1; > - if (unlikely(valid_block_count + sbi->current_reserved_blocks > > - sbi->user_block_count)) { > + valid_block_count = sbi->total_valid_block_count + > + sbi->current_reserved_blocks + 1; > + > + if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE))) > + valid_block_count += sbi->root_reserved_blocks; > + > + if (unlikely(valid_block_count > sbi->user_block_count)) { > spin_unlock(&sbi->stat_lock); > goto enospc; > } > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c > index 5c6a02b558f0..e814340bc2f0 100644 > --- a/fs/f2fs/super.c > +++ b/fs/f2fs/super.c > @@ -107,6 +107,7 @@ enum { > Opt_noextent_cache, > Opt_noinline_data, > Opt_data_flush, > + Opt_reserve_root, > Opt_mode, > Opt_io_size_bits, > Opt_fault_injection, > @@ -157,6 +158,7 @@ static match_table_t f2fs_tokens = { > {Opt_noextent_cache, "noextent_cache"}, > {Opt_noinline_data, "noinline_data"}, > {Opt_data_flush, "data_flush"}, > + {Opt_reserve_root, "reserve_root=%u"}, > 
{Opt_mode, "mode=%s"}, > {Opt_io_size_bits, "io_bits=%u"}, > {Opt_fault_injection, "fault_injection=%u"}, > @@ -488,6 +490,18 @@ static int parse_options(struct super_block *sb, char > *options) > case Opt_data_flush: >
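For readers following the review comment about the limit, here is a small user-space sketch of the 0.2% computation and of the kind of clamp a remount-time check could apply. It assumes a 32-bit block_t; the ex_ helpers are hypothetical and the 64-bit widening before the shift is an addition of this sketch, not part of the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t block_t;

/* 0.2% of the user blocks, computed as (count << 1) / 1000 like the
 * quoted helper; the shift is done in 64 bits here to avoid overflow. */
static inline block_t ex_reserve_root_limit(block_t user_block_count)
{
	return (block_t)(((uint64_t)user_block_count << 1) / 1000);
}

/* Clamp a requested reservation to the limit, as a check placed in
 * option parsing (and therefore also run on remount) might do. */
static inline block_t ex_clamp_root_reserved(block_t requested,
					     block_t user_block_count)
{
	block_t limit = ex_reserve_root_limit(user_block_count);

	return requested > limit ? limit : requested;
}

/* Blocks the caller may use: only a privileged caller on a filesystem
 * mounted with the option may dip into the root reserve. */
static inline block_t ex_avail_blocks(block_t user_block_count,
				      block_t current_reserved,
				      block_t root_reserved,
				      bool opt_reserve_root, bool privileged)
{
	block_t avail;

	assert((uint64_t)current_reserved + root_reserved <= user_block_count);
	avail = user_block_count - current_reserved;
	if (!(opt_reserve_root && privileged))
		avail -= root_reserved;
	return avail;
}
```

For example, with 1000000 user blocks the limit is 2000 blocks, so a request for 5000 would be clamped to 2000.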
[PATCH v2 1/4] dmaengine: xilinx_dma: populate dma caps properly
When a client driver calls the dma_get_slave_caps() API, it checks certain fields of struct dma_device. Currently the driver does not set the directions and addr_widths fields, so dma_get_slave_caps() returns failure. This patch fixes the issue by populating the directions and addr_widths fields of struct dma_device with proper values. Signed-off-by: Kedareswara rao Appana --- Changes for v2: --> Improved commit message title and description as suggested by Vinod. drivers/dma/xilinx/xilinx_dma.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 88d317d..21ac954 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -2398,6 +2398,7 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev, chan->direction = DMA_MEM_TO_DEV; chan->id = chan_id; chan->tdest = chan_id; + xdev->common.directions = BIT(DMA_MEM_TO_DEV); chan->ctrl_offset = XILINX_DMA_MM2S_CTRL_OFFSET; if (xdev->dma_config->dmatype == XDMA_TYPE_VDMA) { @@ -2415,6 +2416,7 @@ static int xilinx_dma_chan_probe(struct xilinx_dma_device *xdev, chan->direction = DMA_DEV_TO_MEM; chan->id = chan_id; chan->tdest = chan_id - xdev->nr_channels; + xdev->common.directions |= BIT(DMA_DEV_TO_MEM); chan->ctrl_offset = XILINX_DMA_S2MM_CTRL_OFFSET; if (xdev->dma_config->dmatype == XDMA_TYPE_VDMA) { @@ -2629,6 +2631,8 @@ static int xilinx_dma_probe(struct platform_device *pdev) dma_cap_set(DMA_PRIVATE, xdev->common.cap_mask); } + xdev->common.dst_addr_widths = BIT(addr_width / 8); + xdev->common.src_addr_widths = BIT(addr_width / 8); xdev->common.device_alloc_chan_resources = xilinx_dma_alloc_chan_resources; xdev->common.device_free_chan_resources = -- 2.7.4
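As an illustration of what the populated fields end up holding, the following user-space sketch builds the same kind of bitmasks. The ex_ names are hypothetical; the enum only mirrors the ordering of the kernel's enum dma_transfer_direction, and unlike the patch the sketch ORs in both directions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_BIT(n) (1u << (n))

/* Same ordering as the kernel's enum dma_transfer_direction. */
enum ex_dma_dir { EX_MEM_TO_MEM, EX_MEM_TO_DEV, EX_DEV_TO_MEM, EX_DEV_TO_DEV };

struct ex_dma_caps {
	uint32_t directions;      /* one bit per supported direction */
	uint32_t src_addr_widths; /* one bit per supported width in bytes */
	uint32_t dst_addr_widths;
};

/* Width mask for an address width given in bits: bit (bits / 8). */
static inline uint32_t ex_width_mask(unsigned int addr_width_bits)
{
	assert(addr_width_bits % 8 == 0 && addr_width_bits / 8 < 32);
	return EX_BIT(addr_width_bits / 8);
}

/* Collect capabilities for a device with optional MM2S/S2MM channels. */
static inline struct ex_dma_caps ex_build_caps(bool has_mm2s, bool has_s2mm,
					       unsigned int addr_width_bits)
{
	struct ex_dma_caps caps = { 0, 0, 0 };

	if (has_mm2s)
		caps.directions |= EX_BIT(EX_MEM_TO_DEV);
	if (has_s2mm)
		caps.directions |= EX_BIT(EX_DEV_TO_MEM);
	caps.src_addr_widths = ex_width_mask(addr_width_bits);
	caps.dst_addr_widths = ex_width_mask(addr_width_bits);
	return caps;
}
```

A 32-bit address width therefore sets bit 4, the bit for a 4-byte bus width, in both width masks.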
[PATCH v2 3/4] dmaengine: xilinx_dma: Fix warning variable prev set but not used
This patch fixes the below sparse warning in the driver: drivers/dma/xilinx/xilinx_dma.c: In function ‘xilinx_vdma_dma_prep_interleaved’: drivers/dma/xilinx/xilinx_dma.c:1614:43: warning: variable ‘prev’ set but not used [-Wunused-but-set-variable] struct xilinx_vdma_tx_segment *segment, *prev = NULL; Signed-off-by: Kedareswara rao Appana --- Changes for v2: --> Improved commit message title and description as suggested by Vinod. drivers/dma/xilinx/xilinx_dma.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 8467671..845e638 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -1611,7 +1611,7 @@ xilinx_vdma_dma_prep_interleaved(struct dma_chan *dchan, { struct xilinx_dma_chan *chan = to_xilinx_chan(dchan); struct xilinx_dma_tx_descriptor *desc; - struct xilinx_vdma_tx_segment *segment, *prev = NULL; + struct xilinx_vdma_tx_segment *segment; struct xilinx_vdma_desc_hw *hw; if (!is_slave_direction(xt->dir)) @@ -1665,8 +1665,6 @@ xilinx_vdma_dma_prep_interleaved(struct dma_chan *dchan, /* Insert the segment into the descriptor segments list. */ list_add_tail(&segment->node, &desc->segments); - prev = segment; - /* Link the last hardware descriptor with the first. */ segment = list_first_entry(&desc->segments, struct xilinx_vdma_tx_segment, node); -- 2.7.4
[PATCH v2 0/4] dmaengine: xilinx_dma: Bug fixes
This patch series does the following: --> Fixes sparse warnings in the driver. --> Properly configures the SG mode bit in the driver for CDMA. --> Populates DMA caps properly. This patch series was created on top of Linux tag 4.15-rc4, i.e. the slave-dma.git next branch. Kedareswara rao Appana (4): dmaengine: xilinx_dma: populate dma caps properly dmaengine: xilinx_dma: properly configure the SG mode bit in the driver for cdma dmaengine: xilinx_dma: Fix warning variable prev set but not used dmaengine: xilinx_dma: Free BD consistent memory drivers/dma/xilinx/xilinx_dma.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) -- 2.7.4
[PATCH v2 2/4] dmaengine: xilinx_dma: properly configure the SG mode bit in the driver for cdma
If the hardware is configured for Scatter Gather (SG) mode and the hardware is idle, software must set the SG mode bit in the control register to 0 and then back to 1 to force the CDMA SG engine to use a new value written to the CURDESC_PNTR register. Failure to do so could result in errors from the dmaengine. This patch updates the driver accordingly. Signed-off-by: Kedareswara rao Appana --- Changes for v2: --> Improved commit message title and description as suggested by Vinod. drivers/dma/xilinx/xilinx_dma.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 21ac954..8467671 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -1204,6 +1204,12 @@ static void xilinx_cdma_start_transfer(struct xilinx_dma_chan *chan) } if (chan->has_sg) { + dma_ctrl_clr(chan, XILINX_DMA_REG_DMACR, +XILINX_CDMA_CR_SGMODE); + + dma_ctrl_set(chan, XILINX_DMA_REG_DMACR, +XILINX_CDMA_CR_SGMODE); + xilinx_write(chan, XILINX_DMA_REG_CURDESC, head_desc->async_tx.phys); @@ -2052,6 +2058,10 @@ static int xilinx_dma_terminate_all(struct dma_chan *dchan) chan->cyclic = false; } + if ((chan->xdev->dma_config->dmatype == XDMA_TYPE_CDMA) && chan->has_sg) + dma_ctrl_clr(chan, XILINX_DMA_REG_DMACR, +XILINX_CDMA_CR_SGMODE); + return 0; } -- 2.7.4
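The ordering described in the commit message can be modelled in user space roughly as below. This is a mock register file, not driver code: the ex_ names, the struct and the assumed bit position of the SG mode flag are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define EX_CR_SGMODE (1u << 3) /* assumed bit position, illustration only */

/* Mock of the two registers involved plus a counter of 0 -> 1 edges. */
struct ex_cdma {
	uint32_t dmacr;
	uint32_t curdesc;
	unsigned int sg_rising_edges;
};

static inline void ex_ctrl_clr(struct ex_cdma *c, uint32_t bits)
{
	c->dmacr &= ~bits;
}

static inline void ex_ctrl_set(struct ex_cdma *c, uint32_t bits)
{
	/* Count a 0 -> 1 transition of the SG mode bit. */
	if ((bits & EX_CR_SGMODE) && !(c->dmacr & EX_CR_SGMODE))
		c->sg_rising_edges++;
	c->dmacr |= bits;
}

/* Toggle SG mode off and on, then program the descriptor pointer. */
static inline void ex_start_transfer(struct ex_cdma *c, uint32_t desc_phys)
{
	assert(desc_phys != 0);
	ex_ctrl_clr(c, EX_CR_SGMODE);
	ex_ctrl_set(c, EX_CR_SGMODE);
	c->curdesc = desc_phys;
}

/* Leave SG mode cleared when the channel is terminated. */
static inline void ex_terminate(struct ex_cdma *c)
{
	ex_ctrl_clr(c, EX_CR_SGMODE);
}
```

Even if the bit was already set from a previous transfer, each start produces exactly one rising edge before the new descriptor pointer is written.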
Re: [PATCH v3 00/27] kill devm_ioremap_nocache
+ cris/ia64/mn10300/openrisc maintainers On 2017/12/25 9:09, Yisheng Xie wrote: > hi Christophe and Greg, > > On 2017/12/24 16:55, christophe leroy wrote: >> >> >> On 23/12/2017 at 16:57, Guenter Roeck wrote: >>> On 12/23/2017 05:48 AM, Greg KH wrote: On Sat, Dec 23, 2017 at 06:55:25PM +0800, Yisheng Xie wrote: > Hi all, > > When I tried to use devm_ioremap function and review related code, I found > devm_ioremap and devm_ioremap_nocache is almost the same with each other, > except one use ioremap while the other use ioremap_nocache. For all arches? Really? Look at MIPS, and x86, they have different functions. >>> >>> Both mips and x86 end up mapping the same function, but other arches don't. >>> mn10300 is one where ioremap and ioremap_nocache are definitely different. >> >> alpha: identical >> arc: identical >> arm: identical >> arm64: identical >> cris: different<== >> frv: identical >> hexagone: identical >> ia64: different<== >> m32r: identical >> m68k: identical >> metag: identical >> microblaze: identical >> mips: identical >> mn10300: different <== >> nios: identical >> openrisc: different<== >> parisc: identical >> riscv: identical >> s390: identical >> sh: identical >> sparc: identical >> tile: identical >> um: rely on asm/generic >> unicore32: identical >> x86: identical >> asm/generic (no mmu): identical > > Wow, that's correct, sorry, I have just checked the main archs, I mean > x86, arm, arm64, mips. > > However, I still have no idea why these 4 archs want a different ioremap > function from the others. Drivers seem unable to be aware of this? If a driver calls ioremap, > what does it really want on these 4 archs, cache or nocache? Could you please help with this? It is beyond my knowledge. Thanks Yisheng > >> >> So 4 among all arches seem to have ioremap() and ioremap_nocache() being >> different. >> >> Could we have a define set by the 4 arches on which ioremap() and >> ioremap_nocache() are different, something like >> HAVE_DIFFERENT_IOREMAP_NOCACHE ? 
> > Then, what is HAVE_DIFFERENT_IOREMAP_NOCACHE used for ? > > Thanks > Yisheng >> >> Christophe >> >>> >>> Guenter >>> > While ioremap's > default function is ioremap_nocache, so devm_ioremap_nocache also have the > same function with devm_ioremap, which can just be killed to reduce the > size > of devres.o(from 20304 bytes to 18992 bytes in my compile environment). > > I have posted two versions, which use macro instead of function for > devm_ioremap_nocache[1] or devm_ioremap[2]. And Greg suggest me to kill > devm_ioremap_nocache for no need to keep a macro around for the duplicate > thing. So here comes v3 and please help to review. I don't think this can be done, what am I missing? These functions are not identical, sorry for missing that before. > > Never mind, I should have checked all the arches, sorry about that. > thanks, greg k-h
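One possible reading of the HAVE_DIFFERENT_IOREMAP_NOCACHE suggestion is sketched below with mock functions. This is a guess at the intent, not something stated in the thread: the ex_ and EX_ names are hypothetical stand-ins, and nothing here is kernel code.

```c
#include <assert.h>

enum ex_map_kind { EX_MAP_DEFAULT, EX_MAP_NOCACHE };

/* Mock stand-ins for the per-arch mapping functions. */
static inline enum ex_map_kind ex_ioremap(void)
{
	return EX_MAP_DEFAULT;
}

static inline enum ex_map_kind ex_ioremap_nocache(void)
{
	return EX_MAP_NOCACHE;
}

/* Managed wrapper that always exists. */
static inline enum ex_map_kind ex_devm_ioremap(void)
{
	return ex_ioremap();
}

#ifdef EX_HAVE_DIFFERENT_IOREMAP_NOCACHE
/* Arches where the two mappings differ keep a real second wrapper. */
static inline enum ex_map_kind ex_devm_ioremap_nocache(void)
{
	return ex_ioremap_nocache();
}
#else
/* Everywhere else the nocache variant is only an alias, so no second
 * function needs to be compiled in. */
#define ex_devm_ioremap_nocache ex_devm_ioremap
#endif
```

Built without the define, both names resolve to the same wrapper, which is where the size saving mentioned in the cover letter would come from on most arches.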
[PATCH v2 4/4] dmaengine: xilinx_dma: Free BD consistent memory
Free BD consistent memory while freeing the channel, i.e. in free_chan_resources. Signed-off-by: Radhey Shyam Pandey Signed-off-by: Kedareswara rao Appana --- Changes for v2: --> None. drivers/dma/xilinx/xilinx_dma.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 845e638..a9edbd8 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -764,6 +764,11 @@ static void xilinx_dma_free_chan_resources(struct dma_chan *dchan) INIT_LIST_HEAD(&chan->free_seg_list); spin_unlock_irqrestore(&chan->lock, flags); + /* Free memory that is allocated for BD */ + dma_free_coherent(chan->dev, sizeof(*chan->seg_v) * + XILINX_DMA_NUM_DESCS, chan->seg_v, + chan->seg_p); + /* Free Memory that is allocated for cyclic DMA Mode */ dma_free_coherent(chan->dev, sizeof(*chan->cyclic_seg_v), chan->cyclic_seg_v, chan->cyclic_seg_p); -- 2.7.4
Re: [PATCH v2] regulator: sc2731: Fix defines for SC2731_WR_UNLOCK and SC2731_PWR_WR_PROT_VALUE
Hi Axel, On Mon, Jan 01, 2018 at 08:38:50PM +0800, Axel Lin wrote: > The defines for SC2731_WR_UNLOCK and SC2731_PWR_WR_PROT_VALUE make the > regmap_write() call look strange because it takes the reg parameter first, > then val. > Based on Erick's suggestion, define SC2731_PWR_WR_PROT and > SC2731_WR_UNLOCK_VALUE instead. > > Signed-off-by: Axel Lin Reviewed-by: Erick Chen
Re: WARNING in adjust_ptr_min_max_vals
On Tue, Jan 02, 2018 at 08:58:01PM -0800, syzbot wrote: > Hello, > > syzkaller hit the following crash on > 0e08c463db387a2adcb0243b15ab868a73f87807 > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > C reproducer is attached > syzkaller reproducer is attached. See https://goo.gl/kgGztJ > for information about syzkaller reproducers > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+6d362cadd45dc0a12...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for > details. > If you forward the report, please keep this part and the footer. > > audit: type=1400 audit(1514685224.971:7): avc: denied { map } for > pid=3144 comm="syzkaller663366" path="/root/syzkaller663366580" dev="sda1" > ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 > tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1 > WARNING: CPU: 1 PID: 3144 at kernel/bpf/verifier.c:2359 > adjust_ptr_min_max_vals+0x977/0x20a0 kernel/bpf/verifier.c:2359 > Kernel panic - not syncing: panic_on_warn set ... 
> > CPU: 1 PID: 3144 Comm: syzkaller663366 Not tainted 4.15.0-rc4-next-20171221+ > #78 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > panic+0x1e4/0x41c kernel/panic.c:183 > __warn+0x1dc/0x200 kernel/panic.c:547 > report_bug+0x211/0x2d0 lib/bug.c:184 > fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:177 > fixup_bug arch/x86/kernel/traps.c:246 [inline] > do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:295 > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:314 > invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 > RIP: 0010:adjust_ptr_min_max_vals+0x977/0x20a0 kernel/bpf/verifier.c:2359 > RSP: 0018:8801c97ef198 EFLAGS: 00010293 > RAX: 8801c94e8240 RBX: 8801c8ee4b00 RCX: 817eebb7 > RDX: RSI: c9002048 RDI: c9002049 > RBP: 8801c97ef228 R08: R09: 858fa920 > R10: 0071 R11: 858f9d00 R12: > R13: 0001 R14: c9002048 R15: 8801c8e02040 > adjust_reg_min_max_vals kernel/bpf/verifier.c:2799 [inline] > check_alu_op kernel/bpf/verifier.c:2997 [inline] > do_check+0x67e0/0xae20 kernel/bpf/verifier.c:4448 > bpf_check+0x2b1b/0x49f0 kernel/bpf/verifier.c:5374 > bpf_prog_load+0xa2a/0x1b00 kernel/bpf/syscall.c:1192 > SYSC_bpf kernel/bpf/syscall.c:1724 [inline] > SyS_bpf+0x1044/0x4420 kernel/bpf/syscall.c:1686 > entry_SYSCALL_64_fastpath+0x1f/0x96 That's an interesting bug. If I decipher the fuzzed bpf insns correctly, the sequence: r0 = 0x0 if r0 s<= 0x0 goto pc+0 r0 -= r1 triggers: if (WARN_ON_ONCE(known && (smin_val != smax_val))) { with smin_val=1 and smax_val=0, since the verifier did: case BPF_JSLE: false_reg->smin_value = max_t(s64, false_reg->smin_value, val + 1); Not sure what the best fix is yet.
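The arithmetic behind smin_val=1, smax_val=0 can be reproduced with a tiny stand-alone model. The struct and the ex_ helpers are simplified stand-ins written for this note, not the verifier's own types; only the max(smin, val + 1) update is taken from the line quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified signed bounds of one register. */
struct ex_reg {
	int64_t smin;
	int64_t smax;
};

static inline int64_t ex_max_s64(int64_t a, int64_t b)
{
	return a > b ? a : b;
}

/* Fall-through (false) branch of "if reg s<= val": the register must be
 * greater than val there, so the lower bound is raised to val + 1. */
static inline void ex_jsle_false_branch(struct ex_reg *reg, int64_t val)
{
	assert(val < INT64_MAX);
	reg->smin = ex_max_s64(reg->smin, val + 1);
}

/* Bounds are only meaningful while smin <= smax. */
static inline bool ex_bounds_sane(const struct ex_reg *reg)
{
	return reg->smin <= reg->smax;
}
```

Starting from the known constant 0 (smin = smax = 0), the false branch of "r0 s<= 0" raises smin to 1 while smax stays 0. That branch can never be taken at run time, but the model still carries the impossible range forward, which matches the state reported in the warning.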
Re: [PATCH v5 2/2] PCI: mediatek: Set up class type and vendor ID for MT7622
On Tue, 2018-01-02 at 10:56 +, Lorenzo Pieralisi wrote: > On Thu, Dec 28, 2017 at 09:39:12AM +0800, Honghui Zhang wrote: > > On Wed, 2017-12-27 at 12:45 -0600, Bjorn Helgaas wrote: > > > On Wed, Dec 27, 2017 at 08:59:54AM +0800, honghui.zh...@mediatek.com > > > wrote: > > > > From: Honghui Zhang> > > > > > > > + /* Set up class code for MT7622 */ > > > > + val = PCI_CLASS_BRIDGE_PCI << 16; > > > > + writel(val, port->base + PCIE_CONF_CLASS); > > > > > > 1) Your comments mention MT7622 specifically, but this code is run for > > > both mt2712-pcie and mt7622-pcie. If this code is safe and necessary > > > for both mt2712-pcie and mt7622-pcie, please remove the mention of > > > MT7622. > > > > Hmm, the code snippet added here will only be executed by MT7622, since > > MT2712 will not enter this "if (pcie->base) {" condition. > > Should the mention of MT7622 must be removed in this case? > > You should add an explicit way (eg of_device_is_compatible() match for > instance) to apply the quirk just on the platform that requires it. > > Checking for "if (pcie->base)" is really not the way to do it. > hi, Lorenzo, Thanks very much for your advise. Passing the compatible string or platform data into this function needed to add some new field in the struct mtk_pcie_port, then I guess both set it for MT2712 and MT7622 is an easy way, since re-setting those values for MT2712 is safe. > > > 2) The first comment mentions both "vendor ID and device ID" but you > > > don't write the device ID. Since this code applies to both > > > mt2712-pcie and mt7622-pcie, my guess is that you don't *want* to > > > write the device ID. If that's the case, please fix the comment. > > > > > > > My bad, I did not check the comments carefully. > > Thanks. > > > > > 3) If you only need to set the vendor ID, you're performing a 32-bit > > > write (writel()) to update a 16-bit value. Please use writew() > > > instead. 
> > > > > > > Ok, thanks, I guess I could use the following code snippet in the next > > version: > > val = readl(port->base + PCIE_CONF_VENDOR_ID) > > val &= ~GENMASK(15, 0); > > val |= PCI_VENDOR_ID_MEDIATEK; > > writel(val, port->base + PCIE_CONF_VENDOR_ID); > > Have you read Bjorn's comment ? Or there is a problem with using > a writew() ? > This control register is a 32bit register, I'm not sure whether the apb bus support write an 16bit value with 16bit but not 32bit address alignment. I prefer the more safety old way of writel. I need to do more test about the writew if the code elegant is more important. thanks. > Lorenzo > > > > 4) If you only need to set the vendor ID, please use a definition like > > > "PCIE_CONF_VENDOR_ID" instead of the ambiguous "PCIE_CONF_ID". > > > > > > 5) If you only need to set the vendor ID, please update the changelog > > > to mention "vendor ID" specifically instead of the ambiguous "IDs". > > > > > 6) Please add a space before the closing "*/" of the first comment. > > > > > > 7) PCI_CLASS_BRIDGE_PCI is for a PCI-to-PCI bridge, i.e., one that has > > > PCI on both the primary (upstream) side and the secondary (downstream) > > > side. That kind of bridge has a type 1 config header (see > > > PCI_HEADER_TYPE) and the PCI_PRIMARY_BUS and PCI_SECONDARY_BUS > > > registers tell us the bus number of the primary and secondary sides. > > > > > > I don't believe this device is a PCI-to-PCI bridge. I think it's a > > > *host* bridge that has some non-PCI interface on the upstream side and > > > should have a type 0 config header. If that's the case you should use > > > PCI_CLASS_BRIDGE_HOST instead. > > > > > > > Thanks very much for your help with the review, I will fix the other > > issue in the next version. 
> > > > > > } > > > > > > > > /* Assert all reset signals */ > > > > diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h > > > > index ab20dc5..2480b0e 100644 > > > > --- a/include/linux/pci_ids.h > > > > +++ b/include/linux/pci_ids.h > > > > @@ -2113,6 +2113,8 @@ > > > > > > > > #define PCI_VENDOR_ID_MYRICOM 0x14c1 > > > > > > > > +#define PCI_VENDOR_ID_MEDIATEK 0x14c3 > > > > + > > > > #define PCI_VENDOR_ID_TITAN0x14D2 > > > > #define PCI_DEVICE_ID_TITAN_010L 0x8001 > > > > #define PCI_DEVICE_ID_TITAN_100L 0x8010 > > > > -- > > > > 2.6.4 > > > > > > > >
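For readers following the writel()/writew() exchange, here is a minimal user-space sketch of the read-modify-write variant proposed in the thread. It assumes the vendor ID occupies bits 15:0 of a 32-bit register, as the quoted snippet implies. MASK32 and set_vendor_id are hypothetical names standing in for GENMASK and the driver code; the register is modeled as a plain integer rather than MMIO.

```c
#include <stdint.h>

/* Value taken from the quoted pci_ids.h hunk; the macro name is local to this sketch. */
#define DEMO_VENDOR_ID_MEDIATEK 0x14c3u

/* Mask covering bits h..l of a 32-bit word (user-space stand-in for GENMASK). */
#define MASK32(h, l) ((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/*
 * Replace only the low 16 bits of a 32-bit register value and keep the
 * upper 16 bits as they were, which is what the readl/mask/or/writel
 * sequence achieves with a single 32-bit bus access in each direction.
 */
static uint32_t set_vendor_id(uint32_t reg, uint16_t vendor)
{
	reg &= ~MASK32(15, 0);
	reg |= vendor;
	return reg;
}
```

The trade-off discussed in the thread is visible here: the 32-bit variant needs an extra read but does not depend on the bus supporting 16-bit accesses.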
[PATCH] irqchip/gic-v3-its: Add workaround for ThunderX2 erratum #174
When an interrupt is moved across node collections on ThunderX2 multi Socket platform, an interrupt stops routed to new collection and results in loss of interrupts. Adding workaround to issue INV after MOVI for cross-node collection move to flush out the cached entry. Signed-off-by: Ganapatrao Kulkarni--- Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 11 +++ drivers/irqchip/irq-gic-v3-its.c | 24 3 files changed, 36 insertions(+) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index fc1c884..fb27cb5 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -63,6 +63,7 @@ stable kernels. | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | | Cavium | ThunderX Core | #30115 | CAVIUM_ERRATUM_30115 | | Cavium | ThunderX SMMUv2 | #27704 | N/A | +| Cavium | ThunderX2 ITS | #174| CAVIUM_ERRATUM_174 | | Cavium | ThunderX2 SMMUv3| #74 | N/A | | Cavium | ThunderX2 SMMUv3| #126| N/A | || | | | diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index c9a7e9e..71a7e30 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -461,6 +461,17 @@ config ARM64_ERRATUM_843419 If unsure, say Y. +config CAVIUM_ERRATUM_174 + bool "Cavium ThunderX2 erratum 174" + depends on NUMA + default y + help + LPI stops routed to redistributors after inter node collection + move in ITS. Enable workaround to invalidate ITS entry after + inter-node collection move. + + If unsure, say Y. 
+ config CAVIUM_ERRATUM_22375 bool "Cavium erratum 22375, 24313" default y diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 06f025f..d8b9c96 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -46,6 +46,7 @@ #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0) #define ITS_FLAGS_WORKAROUND_CAVIUM_22375 (1ULL << 1) #define ITS_FLAGS_WORKAROUND_CAVIUM_23144 (1ULL << 2) +#define ITS_FLAGS_WORKAROUND_CAVIUM_174(1ULL << 3) #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING(1 << 0) @@ -1119,6 +1120,12 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val, if (cpu != its_dev->event_map.col_map[id]) { target_col = _dev->its->collections[cpu]; its_send_movi(its_dev, target_col, id); + if (its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_174) { + /* Issue INV for cross node collection move. */ + if (cpu_to_node(cpu) != + cpu_to_node(its_dev->event_map.col_map[id])) + its_send_inv(its_dev, id); + } its_dev->event_map.col_map[id] = cpu; irq_data_update_effective_affinity(d, cpumask_of(cpu)); } @@ -2904,6 +2911,15 @@ static int its_force_quiescent(void __iomem *base) } } +static bool __maybe_unused its_enable_quirk_cavium_174(void *data) +{ + struct its_node *its = data; + + its->flags |= ITS_FLAGS_WORKAROUND_CAVIUM_174; + + return true; +} + static bool __maybe_unused its_enable_quirk_cavium_22375(void *data) { struct its_node *its = data; @@ -3031,6 +3047,14 @@ static const struct gic_quirk its_quirks[] = { .init = its_enable_quirk_hip07_161600802, }, #endif +#ifdef CONFIG_CAVIUM_ERRATUM_174 + { + .desc = "ITS: Cavium ThunderX2 erratum 174", + .iidr = 0x13f,/* ThunderX2 pass A1/A2/B0 */ + .mask = 0x, + .init = its_enable_quirk_cavium_174, + }, +#endif { } }; -- 2.9.4
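The decision made in the its_set_affinity() hunk can be sketched in isolation. This is a hedged illustration only: demo_cpu_to_node() and its two-node layout are assumptions invented for the example, and needs_inv_after_movi() is a hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>

/* Hypothetical topology for the example: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static int demo_cpu_to_node(int cpu)
{
	return cpu / 4;
}

/*
 * Mirrors the condition in the patch: an extra INV is only wanted when the
 * workaround flag is set and the interrupt moves to a CPU on another node.
 */
static bool needs_inv_after_movi(bool erratum_174, int old_cpu, int new_cpu)
{
	if (!erratum_174)
		return false;
	return demo_cpu_to_node(old_cpu) != demo_cpu_to_node(new_cpu);
}
```

Moves within a node keep the existing single-command path, so the extra INV cost is paid only for cross-node moves on affected parts.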
Re: [PATCH] KVM: nVMX: remove unnecessary vmwrite from L2->L1 vmexit
On 2018/01/02 17:47, Liran Alon wrote: On 02/01/18 00:58, Paolo Bonzini wrote: The POSTED_INTR_NV field is constant (though it differs between the vmcs01 and vmcs02), there is no need to reload it on vmexit to L1. Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e6223fe8faa1..1e184830a295 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11610,9 +11610,6 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, */ vmx_flush_tlb(vcpu, true); } - /* Restore posted intr vector. */ - if (nested_cpu_has_posted_intr(vmcs12)) - vmcs_write16(POSTED_INTR_NV, POSTED_INTR_VECTOR); vmcs_write32(GUEST_SYSENTER_CS, vmcs12->host_ia32_sysenter_cs); vmcs_writel(GUEST_SYSENTER_ESP, vmcs12->host_ia32_sysenter_esp); Reviewed-by: Liran Alon I would also add to commit message: Fixes: 06a5524f091b ("KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode") Reviewed-by: Quan Xu
Re: [PATCH 16/67] powerpc: rename dma_direct_ to dma_nommu_
Geert Uytterhoeven writes: > On Tue, Jan 2, 2018 at 10:45 AM, Michael Ellerman wrote: >> Christoph Hellwig writes: >> >>> We want to use the dma_direct_ namespace for a generic implementation, >>> so rename powerpc to the second best choice: dma_nommu_. >> >> I'm not a fan of "nommu". Some of the users of direct ops *are* using an >> IOMMU, they're just setting up a 1:1 mapping once at init time, rather >> than mapping dynamically. >> >> Though I don't have a good idea for a better name, maybe "1to1", >> "linear", "premapped" ? > > "identity"? I think that would be wrong, but thanks for trying to help :) The address on the device side is sometimes (often?) offset from the CPU address. So eg. the device can DMA to RAM address 0x0 using address 0x800. Identity would imply 0 == 0 etc. I think "bijective" is the correct term, but that's probably a bit esoteric. cheers
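The distinction between "identity" and an offset mapping can be shown with a few lines of C. This is only a sketch under stated assumptions: DEMO_DMA_OFFSET reuses the 0x800 figure from the example in the message purely for illustration, and cpu_to_dev()/dev_to_cpu() are hypothetical helpers, not powerpc DMA ops.

```c
#include <stdint.h>

/* Hypothetical fixed offset between CPU and device addresses (illustration only). */
#define DEMO_DMA_OFFSET UINT64_C(0x800)

/* CPU physical address -> address the device puts on the bus. */
static uint64_t cpu_to_dev(uint64_t cpu_addr)
{
	return cpu_addr + DEMO_DMA_OFFSET;
}

/* Inverse translation: the mapping is one-to-one, so it can always be undone. */
static uint64_t dev_to_cpu(uint64_t dev_addr)
{
	return dev_addr - DEMO_DMA_OFFSET;
}
```

Every CPU address has exactly one device address and vice versa, yet RAM address 0x0 is not reached through device address 0x0, which is why "identity" would be a misleading name.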
Re: general protection fault in copy_verifier_state
On Tue, Jan 02, 2018 at 02:58:01PM -0800, syzbot wrote: > Hello, > > syzkaller hit the following crash on > 6bb8824732f69de0f233ae6b1a8158e149627b38 > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > C reproducer is attached > syzkaller reproducer is attached. See https://goo.gl/kgGztJ > for information about syzkaller reproducers > > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+32ac5a3e473f2e01c...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. See footer for > details. > If you forward the report, please keep this part and the footer. > > R10: R11: 0246 R12: > R13: 656c6c616b7a7973 R14: 000e R15: > kasan: CONFIG_KASAN_INLINE enabled > kasan: GPF could be caused by NULL-ptr deref or user memory access > general protection fault: [#1] SMP KASAN > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > CPU: 1 PID: 3197 Comm: syzkaller425062 Not tainted 4.15.0-rc5+ #170 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:copy_func_state kernel/bpf/verifier.c:403 [inline] > RIP: 0010:copy_verifier_state+0x364/0x590 kernel/bpf/verifier.c:431 > RSP: 0018:8801c7fff130 EFLAGS: 00010203 > RAX: 0070 RBX: dc00 RCX: 0384 > RDX: RSI: 8801c938d800 RDI: 8801c938d800 > RBP: 8801c7fff188 R08: 8801c938d700 R09: 8801c938d700 > R10: R11: R12: 8801c8066940 > R13: 8801c938d700 R14: R15: 8801c938d800 > FS: 01581880() GS:8801db30() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 20a97000 CR3: 0001c839a001 CR4: 001606e0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > pop_stack+0x8c/0x270 kernel/bpf/verifier.c:449 > push_stack kernel/bpf/verifier.c:491 [inline] > check_cond_jmp_op kernel/bpf/verifier.c:3598 [inline] > do_check+0x4b60/0xa050 kernel/bpf/verifier.c:4731 > bpf_check+0x3296/0x58c0 
kernel/bpf/verifier.c:5489 > bpf_prog_load+0xa2a/0x1b00 kernel/bpf/syscall.c:1198 > SYSC_bpf kernel/bpf/syscall.c:1807 [inline] > SyS_bpf+0x1044/0x4420 kernel/bpf/syscall.c:1769 > entry_SYSCALL_64_fastpath+0x1f/0x96 > RIP: 0033:0x4404f9 > RSP: 002b:7fff03dc4a48 EFLAGS: 0246 ORIG_RAX: 0141 > RAX: ffda RBX: 0001 RCX: 004404f9 > RDX: 0048 RSI: 20903000 RDI: 0005 > RBP: 000f R08: 0002 R09: 3332 > R10: R11: 0246 R12: > R13: 656c6c616b7a7973 R14: 000e R15: > Code: 4b 8d 3c f7 48 89 f8 48 c1 e8 03 80 3c 18 00 0f 85 05 02 00 00 4f 8b > 34 f7 49 8d 8e 84 03 00 00 48 89 c8 48 89 4d c8 48 c1 e8 03 <0f> b6 14 18 48 > 89 c8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 > RIP: copy_func_state kernel/bpf/verifier.c:403 [inline] RSP: > 8801c7fff130 > RIP: copy_verifier_state+0x364/0x590 kernel/bpf/verifier.c:431 RSP: > 8801c7fff130 > ---[ end trace 18f3ab976ca58c6c ]--- thanks for the report. Looks like it needs this fix: diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 98d8637cf70d..0876d4402dc3 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -375,6 +375,8 @@ static int realloc_func_state(struct bpf_func_state *state, int size, static void free_func_state(struct bpf_func_state *state) { + if (!state) + return; kfree(state->stack); kfree(state); } @@ -487,6 +489,8 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, } return >st; err: + free_verifier_state(env->cur_state, true); + env->cur_state = NULL; /* pop all elements and return */ while (!pop_stack(env, NULL, NULL)); return NULL; will submit it properly after few more tests.
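The first hunk of the proposed fix makes the free helper tolerate a NULL pointer. A minimal user-space sketch of that pattern follows, assuming malloc/free in place of the kernel allocators. struct demo_func_state, demo_alloc_func_state(), demo_free_func_state() and the demo_frees counter are all hypothetical names introduced for the example.

```c
#include <stdlib.h>

/* Hypothetical, cut-down state object owning one nested allocation. */
struct demo_func_state {
	int *stack;
};

/* Counts completed frees so the behaviour can be observed in a test. */
static int demo_frees;

/* Allocate a state with a small nested buffer; returns NULL on failure. */
static struct demo_func_state *demo_alloc_func_state(void)
{
	struct demo_func_state *state = calloc(1, sizeof(*state));

	if (!state)
		return NULL;
	/* A failed nested allocation leaves stack NULL, which free() accepts. */
	state->stack = calloc(4, sizeof(*state->stack));
	return state;
}

/*
 * NULL-tolerant free: without the early return, state->stack would be
 * dereferenced through a NULL pointer on error paths where the state
 * was never allocated.
 */
static void demo_free_func_state(struct demo_func_state *state)
{
	if (!state)
		return;
	free(state->stack);
	free(state);
	demo_frees++;
}
```

With the guard in place, cleanup code can call the helper unconditionally on partially initialised state.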
Re: [PATCH v3 18/27] pinctrl: replace devm_ioremap_nocache with devm_ioremap
On 2018/1/2 16:43, Linus Walleij wrote: > On Sat, Dec 23, 2017 at 12:00 PM, Yisheng Xie wrote: > >> Default ioremap is ioremap_nocache, so devm_ioremap has the same >> function with devm_ioremap_nocache, which can just be killed to >> save the size of devres.o >> >> This patch is to use use devm_ioremap instead of devm_ioremap_nocache, >> which should not have any function change but prepare for killing >> devm_ioremap_nocache. >> >> Cc: Linus Walleij >> Cc: linux-g...@vger.kernel.org >> Signed-off-by: Yisheng Xie > > Patch applied. Well, I list the ARCHs related to the change file, do not include cris,ia64,mn10300 and openrisc, which ioremap is not the same as ioremap_nocache, as discussed in cover letter. So please let me know if I need update the comment.
change file                                              ARCH
drivers/pinctrl/bcm/pinctrl-ns2-mux.c | 2 +-             arm/arm64
drivers/pinctrl/bcm/pinctrl-nsp-mux.c | 4 ++--           arm
drivers/pinctrl/freescale/pinctrl-imx1-core.c | 2 +-     arm
drivers/pinctrl/pinctrl-amd.c | 4 ++--                   x86/arm
Thanks Yisheng > > Yours, > Linus Walleij > >
Re: [PATCH v2 0/4] Address error and recovery for AER and DPC
On 2018-01-03 00:32, Bjorn Helgaas wrote:
> On Fri, Dec 29, 2017 at 12:54:15PM +0530, Oza Pawandeep wrote:
>> This patch set brings in support for DPC and AER to co-exist and not
>> to race for recovery.
>>
>> The current implementation of AER and error message broadcasting to
>> the EP driver is tightly coupled and limited to AER service driver.
>> It is important to factor out broadcasting and other link handling
>> callbacks. So that not only when AER gets triggered, but also when
>> DPC get triggered, or both get triggered simultaneously (for e.g.
>> ERR_FATAL), callbacks are handled appropriately.
>> having modularized the code, the race between AER and DPC is handled
>> gracefully.
>> for e.g. when DPC is active and kicked in, AER should not attempt to
>> do recovery, because DPC takes care of it.
>
> High-level question:
>
> We have some convoluted code in negotiate_os_control() and
> aer_service_init() that (I think) essentially disables AER unless the
> platform firmware grants us permission to use it.
>
> The last implementation note in PCIe r3.1, sec 6.2.10 says
>
>   DPC may be controlled in some configurations by platform firmware
>   and in other configurations by the operating system. DPC
>   functionality is strongly linked with the functionality in
>   Advanced Error Reporting. To avoid conflicts over whether platform
>   firmware or the operating system have control of DPC, it is
>   recommended that platform firmware and operating systems always
>   link the control of DPC to the control of Advanced Error Reporting.
>
> I read that as suggesting that we should enable DPC support in Linux
> if and only if we also enable AER. But I don't see anything in DPC
> that looks like that. Should there be something there? Should DPC
> be restructured so it's enabled and handled inside the AER driver
> instead of being a separate driver?
>
> Bjorn

The whole idea of factoring out the error handling and plugging it back
into DPC is to enable DPC to participate synchronously in the
pcie_port_service_driver hooks.

Since AER and DPC are both port service drivers, it makes sense for DPC
to be able to do as much with those callbacks as AER currently can.
Today, however, those callbacks are tightly coupled with the AER driver.

With the callbacks factored out, DPC and AER can act independently in
their own space and gain more control, and, if needed, both can
synchronize the callbacks.

Regards,
Oza.
RE: [LINUX PATCH 3/4] dmaengine: xilinx_dma: Fix compilation warning
Hi Vinod,

>On Wed, Jan 03, 2018 at 05:13:29AM +, Appana Durga Kedareswara Rao wrote:
>> Hi Vinod,
>>
>> Thanks for the review...
>>
>> >
>> >On Thu, Dec 21, 2017 at 03:41:37PM +0530, Kedareswara rao Appana wrote:
>> >
>> >Fix title here too
>>
>> Sure will fix in v2...
>>
>> >
>> >BTW whats with LINUX tag in patches, pls drop them
>>
>> Ok will mention the Linux tag info in the cover letter patch from the
>> next patch series on wards...
>
>Please wrap your replies within 80chars. It is very hard to read! I have
>reflown for readability

Sure, I will take care of it from next time onwards...

>
>Can you explain what you mean by that info, what are you trying to convey?

What I mean is that I will mention the Linux kernel tag information in
the cover letter...

Regards,
Kedar.

>
>--
>~Vinod
[PATCH] iommu/of: Only do IOMMU lookup for available ones
for_each_matching_node_and_match() returns every matching node,
including unavailable ones.

It's pointless to init unavailable IOMMUs, so add a sanity check to
avoid that.

Signed-off-by: Jeffy Chen
---

 drivers/iommu/of_iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 50947ebb6d17..6f7456caa30d 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -240,6 +240,9 @@ static int __init of_iommu_init(void)
 	for_each_matching_node_and_match(np, matches, &match) {
 		const of_iommu_init_fn init_fn = match->data;
 
+		if (!of_device_is_available(np))
+			continue;
+
 		if (init_fn && init_fn(np))
 			pr_err("Failed to initialise IOMMU %pOF\n", np);
 	}
-- 
2.11.0
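For readers who want to try the effect of the added check outside the kernel, here is a minimal userspace sketch. It is a model, not kernel code: `struct mock_node`, `mock_node_is_available()`, `mock_sample_init()` and `mock_iommu_init()` are hypothetical names invented for this illustration, and a single init callback is used for all nodes, whereas the real loop takes it from `match->data`. The availability rule is assumed to mirror of_device_is_available(): a node counts as available when it has no "status" property, or when the property is "okay" or "ok".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node; not the kernel type. */
struct mock_node {
	const char *name;
	const char *status;	/* NULL models a missing "status" property */
};

/* Assumed rule: available if status is absent, "okay" or "ok". */
static bool mock_node_is_available(const struct mock_node *np)
{
	if (np->status == NULL)
		return true;
	return strcmp(np->status, "okay") == 0 || strcmp(np->status, "ok") == 0;
}

typedef int (*mock_init_fn)(const struct mock_node *np);

/* Sample init callback: fails for any node whose name starts with "bad". */
static int mock_sample_init(const struct mock_node *np)
{
	return strncmp(np->name, "bad", 3) == 0 ? -1 : 0;
}

/*
 * Walk all nodes, skip unavailable ones before calling init_fn, and
 * count failures. Returns how many nodes init_fn was called on.
 */
static size_t mock_iommu_init(const struct mock_node *nodes, size_t count,
			      mock_init_fn init_fn, size_t *failed)
{
	size_t attempted = 0;

	for (size_t i = 0; i < count; i++) {
		const struct mock_node *np = &nodes[i];

		if (!mock_node_is_available(np))
			continue;	/* the check the patch adds */
		if (init_fn == NULL)
			continue;

		attempted++;
		if (init_fn(np) != 0 && failed != NULL)
			(*failed)++;
	}
	return attempted;
}
```

Given four nodes with status "okay", "disabled", no status, and "ok", the callback would be attempted on three of them; the disabled node is never touched, which is the behaviour the patch is after.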
Re: [PATCH v3 06/27] gpio: replace devm_ioremap_nocache with devm_ioremap
On 2018/1/2 16:41, Linus Walleij wrote:
> On Sat, Dec 23, 2017 at 11:58 AM, Yisheng Xie wrote:
>
>> Default ioremap is ioremap_nocache, so devm_ioremap has the same
>> function with devm_ioremap_nocache, which can just be killed to
>> save the size of devres.o
>>
>> This patch is to use devm_ioremap instead of devm_ioremap_nocache,
>> which should not have any function change but prepare for killing
>> devm_ioremap_nocache.
>>
>> Cc: Linus Walleij
>> Cc: linux-g...@vger.kernel.org
>> Signed-off-by: Yisheng Xie

I have listed below the architectures relevant to each changed file.
None of them is cris, ia64, mn10300 or openrisc, where ioremap is not
the same as ioremap_nocache, as discussed in the cover letter. Please
let me know if I need to update the comment.

change_file                      ARCH
drivers/gpio/gpio-ath79.c     | 3 +-- mips
drivers/gpio/gpio-em.c        | 6 ++  arm
drivers/gpio/gpio-htc-egpio.c | 4 ++-- arm
drivers/gpio/gpio-xgene.c     | 3 +-- arm64

Thanks
Yisheng

>
> Patch applied.
>
> Yours,
> Linus Walleij
>
linux-next: Tree for Jan 3
Hi all, Changes since 20180102: The clk tree lost its build failure. The kvm-arm tree gained a conflict against Linus' tree. Non-merge commits (relative to Linus' tree): 6587 6916 files changed, 273638 insertions(+), 194470 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu (with and without kvm enabled). Below is a summary of the state of the merge. I am currently merging 255 trees (counting Linus' and 43 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. 
-- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (30a7acd57389 Linux 4.15-rc6) Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi) Merging kbuild-current/fixes (cfe17c9bbe6a kbuild: move cc-option and cc-disable-warning after incl. arch Makefile) Merging arc-current/for-curr (d3b388559fac ARC: handle gcc generated __builtin_trap for older compiler) Merging arm-current/fixes (36b0cb84ee85 ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch) Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs for v4.14-rc7) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (7333b5aca412 KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()) Merging sparc/master (59585b4be9ae sparc64: repair calling incorrect hweight function from stubs) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (bd30ffc414e5 NET: usb: qmi_wwan: add support for YUGA CLM920-NC5 PID 0x9625) Merging bpf/master (2758b3e3e630 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging ipsec/master (2f10a61cee8f xfrm: fix rcu usage in xfrm_get_type_offload) Merging netfilter/master (8bea728dce89 netfilter: nf_tables: fix potential NULL-ptr deref in nf_tables_dump_obj_done()) Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set) Merging wireless-drivers/master (a41886f56b7b Merge tag 'iwlwifi-for-kalle-2017-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes) Merging mac80211/master (04a7279ff12f cfg80211: ship certificates as hex files) Merging sound-current/for-linus (fe08f34d066f ALSA: pcm: Remove incorrect snd_BUG_ON() usages) Merging pci-current/for-linus (1291a0d5049d Linux 4.15-rc4) Merging driver-core.current/driver-core-linus (30a7acd57389 Linux 
4.15-rc6) Merging tty.current/tty-linus (30a7acd57389 Linux 4.15-rc6) Merging usb.current/usb-linus (30a7acd57389 Linux 4.15-rc6) Merging usb-gadget-fixes/fixes (1291a0d5049d Linux 4.15-rc4) Merging usb-serial-fixes/usb-linus (4307413256ac USB: serial: cp210x: add IDs for LifeScan OneTouch Verio IQ) Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup) Merging phy/fixes (2b88212c4cc6 phy: rcar-gen3-usb2: select USB_COMMON) Merging staging.current/staging-linus (30a7acd57389 Linux 4.15-rc6) Merging char-misc.current/char-misc-linus (30a7acd57389 Linux 4.15-rc6) Merging input-current/for-linus (8b7e9d9e2d8b Input: hideep - fix compile error due to missing include file) Merging crypto-current/master (2973633e9f09 crypto: inside-secure - do not use areq->result for partial results) Merging ide/master (0c86a6bd85ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net) Merging vfio-fixes/for-linus (e4
Re: [ANNOUNCE] Git v2.16.0-rc0
Bryan Turner wrote:
> On Tue, Jan 2, 2018 at 9:07 PM, Jonathan Nieder wrote:

>> So my first question is why the basename detection is not working for
>> you. What value of GIT_SSH, GIT_SSH_COMMAND, or core.sshCommand are
>> you using?
>
> So I'd been digging further into this for the last hour because I
> wasn't seeing quite the behavior I was expecting when I ran Git from
> the command line on Ubuntu 12.04 or 14.04, and this nudged me to the
> right answer: We're setting GIT_SSH to a wrapper script. In our case,
> that wrapper script is just calling OpenSSH's ssh with all the
> provided arguments (plus a couple extra ones), but because we're
> setting GIT_SSH at all, that's why the auto variant code is running.
> That being the case, explicitly setting GIT_SSH_VARIANT=ssh may be the
> correct thing to do, to tell Git that we want to be treated like
> "normal" OpenSSH, as opposed to expecting Git to assume we behave like
> OpenSSH (when the Android repo use case clearly shows that assumption
> also doesn't hold).

Ah, that's a comfort.

Setting GIT_SSH_VARIANT would avoid this autodetection code and is
the recommended thing to do. That said, we can't go back in time and
update everyone's tools to do that (e.g. there is not even a release
of repo with [1] out yet), so this is still considered a regression
and I'm glad you found it.

Jonathan

[1] https://gerrit-review.googlesource.com/c/git-repo/+/134950
Re: About the try to remove cross-release feature entirely by Ingo
On 1/3/2018 11:58 AM, Dave Chinner wrote:
> On Wed, Jan 03, 2018 at 11:28:44AM +0900, Byungchul Park wrote:
>> On 1/1/2018 7:18 PM, Matthew Wilcox wrote:
>>> On Sat, Dec 30, 2017 at 06:00:57PM -0500, Theodore Ts'o wrote:
>>>> Also, what to do with TCP connections which are created in
>>>> userspace (with some authentication exchanges happening in
>>>> userspace), and then passed into kernel space for use in kernel
>>>> space, is an interesting question.
>>>
>>> Yes! I'd love to have a lockdep expert weigh in here. I believe
>>> it's legitimate to change a lock's class after it's been used,
>>> essentially destroying it and reinitialising it. If not, it should
>>> be because it's a reasonable design for an object to need different
>>> lock classes for different phases of its existance.
>>
>> I also think it should be done ultimately. And I think it's very much
>> hard since it requires to change the dependency graph of lockdep but
>> anyway possible. It's up to lockdep maintainer's will though..
>
> We used to do this in XFS to work around the fact that the memory
> reclaim context "locks" were too stupid to understand that an object
> referenced and locked above memory allocation could not be
> accessed below in memory reclaim because memory reclaim only accesses
> /unreferenced objects/. We played whack-a-mole with lockdep for
> years to get most of the false positives sorted out.
>
> Hence for a long time we had to re-initialise the lock context for
> the XFS inode iolock in ->evict_inode() so we could lock it for
> reclaim processing. Eventually we ended up completely reworking the
> inode reclaim locking in XFS primarily to get rid of all the nasty
> lockdep hacks we had strewn throughout the code. It was ~2012 we got
> rid of the last inode re-init code, IIRC. Yeah:
>
> commit 4f59af758f9092bc7b266ca919ce6067170e5172
> Author: Christoph Hellwig
> Date: Wed Jul 4 11:13:33 2012 -0400
>
>     xfs: remove iolock lock classes
>
>     Now that we never take the iolock during inode reclaim we don't
>     need to play games with lock classes.
>
>     Signed-off-by: Christoph Hellwig
>     Reviewed-by: Rich Johnston
>     Signed-off-by: Ben Myers
>
> We still have problems with lockdep false positives w.r.t. memory
> allocation contexts, mainly with code that can be called from
> both above and below memory allocation contexts. We've finally
> got __GFP_NOLOCKDEP to be able to annotate memory allocation points
> within such code paths, but that doesn't help with locks
>
> Byungchul, lockdep has a long, long history of having sharp edges
> and being very unfriendly to developers. We've all been scarred by
> lockdep at one time or another and so there's a fair bit of
> resistance to repeating past mistakes and allowing lockdep to
> inflict more scars on us

I understand what you suffered from, and I don't really want to push
this forward forcefully. So far I have handled all the problems myself,
including the final one, i.e. the completion in submit_bio_wait(), with
the invalidation if it's allowed. But yes, who knows the future? The
terrible thing you mentioned might or might not happen because of
cross-release.

I just felt that someone was misunderstanding where the problem came
from, what the problem was, how we could avoid it, why cross-release
should be removed, and so on.

I believe the 3 ways I suggested can help, but I won't insist if none
of you think so.

Thanks a lot anyway for your opinion.

-- 
Thanks,
Byungchul
Re: [PATCH 10/13] ocxl: Add Makefile and Kconfig
On 19/12/17 02:21, Frederic Barrat wrote:
> OCXL_BASE triggers the platform support needed by the driver.
>
> Signed-off-by: Frederic Barrat
> ---
>  drivers/misc/Kconfig       | 1 +
>  drivers/misc/Makefile      | 1 +
>  drivers/misc/ocxl/Kconfig  | 25 +
>  drivers/misc/ocxl/Makefile | 10 ++
>  4 files changed, 37 insertions(+)
>  create mode 100644 drivers/misc/ocxl/Kconfig
>  create mode 100644 drivers/misc/ocxl/Makefile
>
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index f1a5c2357b14..0534f338c84a 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -508,4 +508,5 @@ source "drivers/misc/mic/Kconfig"
>  source "drivers/misc/genwqe/Kconfig"
>  source "drivers/misc/echo/Kconfig"
>  source "drivers/misc/cxl/Kconfig"
> +source "drivers/misc/ocxl/Kconfig"
>  endmenu
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 5ca5f64df478..73326d54e246 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -55,6 +55,7 @@ obj-$(CONFIG_CXL_BASE)		+= cxl/
>  obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
>  obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
>  obj-$(CONFIG_PCI_ENDPOINT_TEST)	+= pci_endpoint_test.o
> +obj-$(CONFIG_OCXL)		+= ocxl/
>
>  lkdtm-$(CONFIG_LKDTM)		+= lkdtm_core.o
>  lkdtm-$(CONFIG_LKDTM)		+= lkdtm_bugs.o
> diff --git a/drivers/misc/ocxl/Kconfig b/drivers/misc/ocxl/Kconfig
> new file mode 100644
> index ..4496b61f48db
> --- /dev/null
> +++ b/drivers/misc/ocxl/Kconfig
> @@ -0,0 +1,25 @@
> +#
> +# Open Coherent Accelerator (OCXL) compatible devices
> +#
> +
> +config OCXL_BASE
> +	bool
> +	default n
> +	select PPC_COPRO_BASE
> +
> +config OCXL
> +	tristate "Support for Open Coherent Accelerators (OCXL)"
> +	depends on PPC_POWERNV && PCI && EEH
> +	select OCXL_BASE
> +	default m
> +	help
> +
> +	  Select this option to enable driver support for Open
> +	  Coherent Accelerators (OCXL). OCXL is otherwise known as
> +	  Open Coherent Accelerator Processor Interface (OCAPI).
> +	  OCAPI allows accelerators in FPGAs to be coherently attached
> +	  to a CPU through a Open CAPI link. This driver enables
> +	  userspace programs to access these accelerators through
> +	  devices found in /dev/ocxl/

I'd prefer more consistency in how we refer to OpenCAPI. "ocxl" is a
driver name that we have purely for historical reasons, it's not really
the name of anything else. I know throughout the various specs and code,
we use "OCAPI" a lot, but that's not really an abbreviation that should
be "user-facing".

Something like:

config OCXL
	tristate "OpenCAPI coherent accelerator support"
	help
	  Select this option to enable the ocxl driver for Open
	  Coherent Accelerator Processor Interface (OpenCAPI) devices.

	  OpenCAPI allows FPGA and ASIC accelerators to be coherently
	  attached to a CPU over an OpenCAPI link.

	  The ocxl driver enables userspace programs to access these
	  accelerators through devices in /dev/ocxl/.

	  For more information, see http://opencapi.org.

	  If unsure, say N.

> +
> +	  If unsure, say N.
> diff --git a/drivers/misc/ocxl/Makefile b/drivers/misc/ocxl/Makefile
> new file mode 100644
> index ..f75853411cfd
> --- /dev/null
> +++ b/drivers/misc/ocxl/Makefile
> @@ -0,0 +1,10 @@
> +ccflags-$(CONFIG_PPC_WERROR)	+= -Werror
> +
> +ocxl-y	+= main.o pci.o config.o file.o pasid.o
> +ocxl-y	+= link.o context.o afu_irq.o sysfs.o trace.o
> +obj-$(CONFIG_OCXL)	+= ocxl.o
> +
> +# For tracepoints to include our trace.h from tracepoint infrastructure:
> +CFLAGS_trace.o := -I$(src)
> +
> +# ccflags-y += -DDEBUG

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited
Re: [PATCH 02/11] clk: sunxi-ng: a83t: Add M divider to TCON1 clock
On Sun, Dec 31, 2017 at 5:01 AM, Jernej Skrabec wrote: > TCON1 also has M divider, contrary to TCON0. > > Fixes: 05359be1176b ("clk: sunxi-ng: Add driver for A83T CCU") > > Signed-off-by: Jernej Skrabec Added "And the mux is only 2 bits wide, instead of 3." to the commit message and applied. ChenYu