from:"Jeff Mahoney"

Re: [PATCH] mod_devicetable.h: reduce sizeof(struct of_device_id) by 80 bytes

2019-05-02 Thread Jeff Mahoney


On 5/2/19 5:41 AM, Rasmus Villemoes wrote:

On 26/04/2019 11.27, Arnd Bergmann wrote:

On Thu, Apr 25, 2019 at 10:31 PM Rasmus Villemoes
 wrote:


For an arm imx_v6_v7_defconfig kernel, .rodata becomes 70K smaller;
.init.data shrinks by another ~13K, making the whole kernel image
about 83K, or 0.3%, smaller.

Signed-off-by: Rasmus Villemoes 


The space savings are nice, but I wonder if the format of these
structures is part of the ABI or not. I have some vague recollection
of that, but it's possible that it's no longer true in this century.

scripts/mod/file2alias.c processes the structures into a different
format and seems to be written specifically to avoid problems
with changes like the one you did. Can anyone confirm that
this is true before we apply the patch?


I can't confirm it, of course, but I did do some digging around and
couldn't find anything other than file2alias, which as you mention is
prepared for such a change. I also couldn't find any specific reason for
the 128 (it's not a #define, so at least originally it didn't seem to be
tied to some external consumer) - Jeff, do you remember why you chose
that back when you did 5e6557722e69?


I had been wondering why I'd been included on this thread.  I completely 
forgot that I wrote this code nearly 15 years ago. :)


It was probably as simple as there not being a real limit for how long 
the compatible string could be and wanting to make it flexible.  I was 
targetting a powerpc mac notebook I had at the time -- not tight memory 
embedded systems, so sorry for that.



But we cannot really know whether there is some userspace tool that
parses the .ko ELF objects the same way that file2alias does, doing
pattern matching on the symbol names etc. I cannot see why anybody would
_do_ that (the in-tree infrastructure already generates the
MODULE_ALIAS() from which modules.alias gets generated), but the only
way of knowing, I think, is to try to apply the patch and see if anybody
complains.


The size is part of the ABI, though.  module-init-tools has a copy of 
the same struct and uses that size to walk an array of of_device_id when 
a module as more than one.  If you shrink it, that will certainly break.


file2alias does the right things only because it's tightly coupled to 
the kernel version it's being used with.  It still directly accesses the 
structure definitions in the headers.


-Jeff

--
Jeff Mahoney
SUSE Labs

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2019-03-23 Thread Jeff Mahoney

On 3/23/19 11:56 AM, Eric W. Biederman wrote:
> Jeff Mahoney  writes:
> 
>> On 4/24/18 10:14 AM, Eric W. Biederman wrote:
>>> je...@suse.com writes:
>>>
>>>> From: Jeff Mahoney 
>>>>
>>>> Hi all -
>>>>
>>>> I recently encountered a customer issue where, on a machine with many TiB
>>>> of memory and a few hundred cores, after a task with a few thousand threads
>>>> and hundreds of files open exited, the system would softlockup.  That
>>>> issue was (is still) being addressed by Nik Borisov's patch to add a
>>>> cond_resched call to shrink_dentry_list.  The underlying issue is still
>>>> there, though.  We just don't complain as loudly.  When a huge task
>>>> exits, now the system is more or less unresponsive for about eight
>>>> minutes.  All CPUs are pinned and every one of them is going through
>>>> dentry and inode eviction for the procfs files associated with each
>>>> thread.  It's made worse by every CPU contending on the super's
>>>> inode list lock.
>>>>
>>>> The numbers get big.  My test case was 4096 threads with 16384 files
>>>> open.  It's a contrived example, but not that far off from the actual
>>>> customer case.  In this case, a simple "find /proc" would create around
>>>> 300 million dentry/inode pairs.  More practically, lsof(1) does it too,
>>>> it just takes longer.  On smaller systems, memory pressure starts pushing
>>>> them out. Memory pressure isn't really an issue on this machine, so we
>>>> end up using well over 100GB for proc files.  It's the combination of
>>>> the wasted CPU cycles in teardown and the wasted memory at runtime that
>>>> pushed me to take this approach.
>>>>
>>>> The biggest culprit is the "fd" and "fdinfo" directories, but those are
>>>> made worse by there being multiple copies of them even for the same
>>>> task without threads getting involved:
>>>>
>>>> - /proc/pid/fd and /proc/pid/task/pid/fd are identical but share no
>>>>   resources.
>>>>
>>>> - Every /proc/pid/task/*/fd directory in a thread group has identical
>>>>   contents (unless unshare(CLONE_FILES) was called), but share no
>>>>   resources.
>>>>
>>>> - If we do a lookup like /proc/pid/fd on a member of a thread group,
>>>>   we'll get a valid directory.  Inside, there will be a complete
>>>>   copy of /proc/pid/task/* just like in /proc/tgid/task.  Again,
>>>>   nothing is shared.
>>>>
>>>> This patch set reduces some (most) of the duplication by conditionally
>>>> replacing some of the directories with symbolic links to copies that are
>>>> identical.
>>>>
>>>> 1) Eliminate the duplication of the task directories between threads.
>>>>The task directory belongs to the thread leader and the threads
>>>>link to it: e.g. /proc/915/task -> ../910/task  This mainly
>>>>reduces duplication when individual threads are looked up directly
>>>>at the tgid level.  The impact varies based on the number of threads.
>>>>The user has to go out of their way in order to mess up their system
>>>>in this way.  But if they were so inclined, they could create ~550
>>>>billion inodes and dentries using the test case.
>>>>
>>>> 2) Eliminate the duplication of directories that are created identically
>>>>between the tgid-level pid directory and its task directory: fd,
>>>>fdinfo, ns, net, attr.  There is obviously more duplication between
>>>>the two directories, but replacing a file with a symbolic link
>>>>doesn't get us anything.  This reduces the number of files associated
>>>>with fd and fdinfo by half if threads aren't involved.
>>>>
>>>> 3) Eliminate the duplication of fd and fdinfo directories among threads
>>>>that share a files_struct.  We check at directory creation time if
>>>>the task is a group leader and if not, whether it shares ->files with
>>>>the group leader.  If so, we create a symbolic link to ../tgid/fd*.
>>>>We use a d_revalidate callback to check whether the thread has called
>>>>unshare(CLONE_FILES) and, if so, fail the revalidation for the symlink.
>>>>Upon re-lookup, a directory will be created in its place.  This is
>>>>prett

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2019-03-21 Thread Jeff Mahoney

On 4/24/18 10:14 AM, Eric W. Biederman wrote:
> je...@suse.com writes:
> 
>> From: Jeff Mahoney 
>>
>> Hi all -
>>
>> I recently encountered a customer issue where, on a machine with many TiB
>> of memory and a few hundred cores, after a task with a few thousand threads
>> and hundreds of files open exited, the system would softlockup.  That
>> issue was (is still) being addressed by Nik Borisov's patch to add a
>> cond_resched call to shrink_dentry_list.  The underlying issue is still
>> there, though.  We just don't complain as loudly.  When a huge task
>> exits, now the system is more or less unresponsive for about eight
>> minutes.  All CPUs are pinned and every one of them is going through
>> dentry and inode eviction for the procfs files associated with each
>> thread.  It's made worse by every CPU contending on the super's
>> inode list lock.
>>
>> The numbers get big.  My test case was 4096 threads with 16384 files
>> open.  It's a contrived example, but not that far off from the actual
>> customer case.  In this case, a simple "find /proc" would create around
>> 300 million dentry/inode pairs.  More practically, lsof(1) does it too,
>> it just takes longer.  On smaller systems, memory pressure starts pushing
>> them out. Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.  It's the combination of
>> the wasted CPU cycles in teardown and the wasted memory at runtime that
>> pushed me to take this approach.
>>
>> The biggest culprit is the "fd" and "fdinfo" directories, but those are
>> made worse by there being multiple copies of them even for the same
>> task without threads getting involved:
>>
>> - /proc/pid/fd and /proc/pid/task/pid/fd are identical but share no
>>   resources.
>>
>> - Every /proc/pid/task/*/fd directory in a thread group has identical
>>   contents (unless unshare(CLONE_FILES) was called), but share no
>>   resources.
>>
>> - If we do a lookup like /proc/pid/fd on a member of a thread group,
>>   we'll get a valid directory.  Inside, there will be a complete
>>   copy of /proc/pid/task/* just like in /proc/tgid/task.  Again,
>>   nothing is shared.
>>
>> This patch set reduces some (most) of the duplication by conditionally
>> replacing some of the directories with symbolic links to copies that are
>> identical.
>>
>> 1) Eliminate the duplication of the task directories between threads.
>>The task directory belongs to the thread leader and the threads
>>link to it: e.g. /proc/915/task -> ../910/task  This mainly
>>reduces duplication when individual threads are looked up directly
>>at the tgid level.  The impact varies based on the number of threads.
>>The user has to go out of their way in order to mess up their system
>>in this way.  But if they were so inclined, they could create ~550
>>billion inodes and dentries using the test case.
>>
>> 2) Eliminate the duplication of directories that are created identically
>>between the tgid-level pid directory and its task directory: fd,
>>fdinfo, ns, net, attr.  There is obviously more duplication between
>>the two directories, but replacing a file with a symbolic link
>>doesn't get us anything.  This reduces the number of files associated
>>with fd and fdinfo by half if threads aren't involved.
>>
>> 3) Eliminate the duplication of fd and fdinfo directories among threads
>>that share a files_struct.  We check at directory creation time if
>>the task is a group leader and if not, whether it shares ->files with
>>the group leader.  If so, we create a symbolic link to ../tgid/fd*.
>>We use a d_revalidate callback to check whether the thread has called
>>unshare(CLONE_FILES) and, if so, fail the revalidation for the symlink.
>>Upon re-lookup, a directory will be created in its place.  This is
>>pretty simple, so if the thread group leader calls unshare, all threads
>>get directories.
>>
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer.  I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)
>>
>> Finally, I'm not a procfs expert.  I'm posting this as an RFC for folks
>> with more knowledge of the details to pick it apart.  The biggest is that
>> I'm not sure if any tools depend on any of these things being directories
>> instead of symlinks.  I'd hope not

Re: [PATCH] reiserfs: remove workaround code for GCC 3.x

2018-10-19 Thread Jeff Mahoney

On 8/26/18 10:33 PM, Masahiro Yamada wrote:
> Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> bumped the minimum GCC version to 4.6 for all architectures.
> 
> The workaround code in fs/reiserfs/Makefile is obsolete now.
> 
> Signed-off-by: Masahiro Yamada 

Acked-by: Jeff Mahoney 

Thanks,

-Jeff

> ---
> 
>  fs/reiserfs/Makefile | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/fs/reiserfs/Makefile b/fs/reiserfs/Makefile
> index a39a562..bd29c58 100644
> --- a/fs/reiserfs/Makefile
> +++ b/fs/reiserfs/Makefile
> @@ -26,14 +26,5 @@ ifeq ($(CONFIG_REISERFS_FS_POSIX_ACL),y)
>  reiserfs-objs += xattr_acl.o
>  endif
>  
> -# gcc -O2 (the kernel default)  is overaggressive on ppc32 when many inline
> -# functions are used.  This causes the compiler to advance the stack
> -# pointer out of the available stack space, corrupting kernel space,
> -# and causing a panic. Since this behavior only affects ppc32, this ifeq
> -# will work around it. If any other architecture displays this behavior,
> -# add it here.
> -ccflags-$(CONFIG_PPC32) := $(call cc-ifversion, -lt, 0400, -O1)
> -
>  TAGS:
>   etags *.c
> -
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [PATCH] reiserfs: remove workaround code for GCC 3.x

2018-10-19 Thread Jeff Mahoney

On 8/26/18 10:33 PM, Masahiro Yamada wrote:
> Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> bumped the minimum GCC version to 4.6 for all architectures.
> 
> The workaround code in fs/reiserfs/Makefile is obsolete now.
> 
> Signed-off-by: Masahiro Yamada 

Acked-by: Jeff Mahoney 

Thanks,

-Jeff

> ---
> 
>  fs/reiserfs/Makefile | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/fs/reiserfs/Makefile b/fs/reiserfs/Makefile
> index a39a562..bd29c58 100644
> --- a/fs/reiserfs/Makefile
> +++ b/fs/reiserfs/Makefile
> @@ -26,14 +26,5 @@ ifeq ($(CONFIG_REISERFS_FS_POSIX_ACL),y)
>  reiserfs-objs += xattr_acl.o
>  endif
>  
> -# gcc -O2 (the kernel default)  is overaggressive on ppc32 when many inline
> -# functions are used.  This causes the compiler to advance the stack
> -# pointer out of the available stack space, corrupting kernel space,
> -# and causing a panic. Since this behavior only affects ppc32, this ifeq
> -# will work around it. If any other architecture displays this behavior,
> -# add it here.
> -ccflags-$(CONFIG_PPC32) := $(call cc-ifversion, -lt, 0400, -O1)
> -
>  TAGS:
>   etags *.c
> -
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [PATCH] reiserfs: fix broken xattr handling (heap corruption, bad retval)

2018-08-13 Thread Jeff Mahoney

On 8/13/18 2:04 PM, Jann Horn wrote:
> On Mon, Aug 13, 2018 at 7:42 PM Will Deacon  wrote:
>>
>> Hi Jann,
>>
>> On Fri, Aug 10, 2018 at 05:19:38AM +0200, Jann Horn wrote:
>>> On Thu, Aug 2, 2018 at 5:16 PM Jann Horn  wrote:
>>>>
>>>> This fixes the following issues:
>>>>
>>>>  - When a buffer size is supplied to reiserfs_listxattr() such that each
>>>>individual name fits, but the concatenation of all names doesn't
>>>>fit, reiserfs_listxattr() overflows the supplied buffer. This leads to
>>>>a kernel heap overflow (verified using KASAN) followed by an
>>>>out-of-bounds usercopy and is therefore a security bug.
>>>>  - When a buffer size is supplied to reiserfs_listxattr() such that a name
>>>>doesn't fit, -ERANGE should be returned. But reiserfs instead just
>>>>truncates the list of names; I have verified that if the only xattr on
>>>>a file has a longer name than the supplied buffer length, listxattr()
>>>>incorrectly returns zero.
>>>>
>>>> With my patch applied, -ERANGE is returned in both cases and the memory
>>>> corruption doesn't happen anymore.
>>>>
>>>> Credit for making me clean this code up a bit goes to Al Viro, who pointed
>>>> out that the ->actor calling convention is suboptimal and should be
>>>> changed.
>>>>
>>>> Fixes: 48b32a3553a5 ("reiserfs: use generic xattr handlers")
>>>> Cc: sta...@vger.kernel.org
>>>> Signed-off-by: Jann Horn 
>>>
>>> +security@
>>> Ping. I have not received any replies to this patch, which fixes a
>>> kernel security bug, for a week.
>>> Whose tree should this go through? reiserfs is marked as "supported",
>>> but does not have a maintainer or a git repo listed, just a
>>> mailinglist, so I guess it probably has to go through either Al Viro's
>>> or akpm's tree? Looks like akpm signed off on the last commits in
>>> reiserfs...
>>
>> I think Andrew's tree makes the most sense for this,
> 
> Yeah, Andrew has already merged it. :)
> http://ozlabs.org/~akpm/mmots/broken-out/reiserfs-fix-broken-xattr-handling-heap-corruption-bad-retval.patch
> 
>> but perhaps we should
>> also patch MAINTAINERS so mark it as "Orphan"? Patch below.
> 
> Either that, or get someone to step up as maintainer? If I read
> https://marc.info/?l=reiserfs-devel=153214303506948=2#0 correctly,
> there's still an intent to fix things in reiserfs, even though no
> maintainer is listed. (Jeff Mahoney, who wrote that message and is
> CC'ed on this thread, seems to have been out of office last week - when
> I sent the "Ping" message a few days ago, I got a vacation
> autoresponder "I'll be out of the office until 13 August" from him.)

I suppose I can take a more active role here.  I'm probably the person
with the most experience with reiserfs who still has a role where I need
to care about it.

-Jeff

>> Will
>>
>> --->8
>>
>> From 07fbb021d5bbfe623fad10073b55704bda8e1f3d Mon Sep 17 00:00:00 2001
>> From: Will Deacon 
>> Date: Mon, 13 Aug 2018 18:31:50 +0100
>> Subject: [PATCH] MAINTAINERS: Mark reiserfs as Orphan
>>
>> Reiserfs has no Maintainer and random fixes tend to be merged through
>> with Andrew or Al's tree. Demote the filesystem to "Orphan", since it's
>> clear no longer supported by anybody.
>>
>> Reported-by: Jann Horn 
>> Signed-off-by: Will Deacon 
>> ---
>>  MAINTAINERS | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 544cac829cf4..b4fcc19cfb52 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -12077,7 +12077,7 @@ F:  include/linux/regmap.h
>>
>>  REISERFS FILE SYSTEM
>>  L: reiserfs-de...@vger.kernel.org
>> -S: Supported
>> +S: Orphan
>>  F: fs/reiserfs/
>>
>>  REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM
>> --
>> 2.1.4
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [PATCH] reiserfs: fix broken xattr handling (heap corruption, bad retval)

2018-08-13 Thread Jeff Mahoney

On 8/2/18 11:15 AM, Jann Horn wrote:
> This fixes the following issues:
> 
>  - When a buffer size is supplied to reiserfs_listxattr() such that each
>individual name fits, but the concatenation of all names doesn't
>fit, reiserfs_listxattr() overflows the supplied buffer. This leads to
>a kernel heap overflow (verified using KASAN) followed by an
>out-of-bounds usercopy and is therefore a security bug.
>  - When a buffer size is supplied to reiserfs_listxattr() such that a name
>doesn't fit, -ERANGE should be returned. But reiserfs instead just
>truncates the list of names; I have verified that if the only xattr on
>a file has a longer name than the supplied buffer length, listxattr()
>incorrectly returns zero.
> 
> With my patch applied, -ERANGE is returned in both cases and the memory
> corruption doesn't happen anymore.
> 
> Credit for making me clean this code up a bit goes to Al Viro, who pointed
> out that the ->actor calling convention is suboptimal and should be
> changed.
> 
> Fixes: 48b32a3553a5 ("reiserfs: use generic xattr handlers")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jann Horn 

Acked-by: Jeff Mahoney 

Thanks,

-Jeff

> ---
> Triggering the bug:
> 
> root@debian:/home/user# mount -o user_xattr reiserimg reisermount/
> root@debian:/home/user# cd reisermount/
> root@debian:/home/user/reisermount# touch test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo1 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo2 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo3 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo4 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo5 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo6 -v A test_file
> root@debian:/home/user/reisermount# cat xattr_test.c
> #include 
> #include 
> #include 
> #include 
> #include 
> int main(int argc, char **argv) {
>   if (argc != 2) errx(1, "bad invocation");
>   char list[10];
>   int res = listxattr(argv[1], list, sizeof(list));
>   if (res == -1)
> err(1, "listxattr failed");
>   printf("listxattr returned %d\n", res);
>   for (char *p = list; p < list+res-1; p = p + strlen(p) + 1) {
> printf("list entry: %s\n", p);
>   }
> }
> root@debian:/home/user/reisermount# gcc -o xattr_test xattr_test.c
> root@debian:/home/user/reisermount# ./xattr_test test_file
> Segmentation fault
> root@debian:/home/user/reisermount#
> 
> Result:
> 
> [  122.071318] 
> ==
> [  122.072334] BUG: KASAN: slab-out-of-bounds in listxattr_filler+0x170/0x1b0
> [  122.073173] Write of size 9 at addr 8801c43b474a by task xattr_test/923
> [  122.074030]
> [  122.074223] CPU: 1 PID: 923 Comm: xattr_test Not tainted 4.18.0-rc7+ #67
> [  122.075050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.10.2-1 04/01/2014
> [  122.076107] Call Trace:
> [  122.076453]  dump_stack+0x71/0xab
> [  122.076900]  print_address_description+0x6a/0x250
> [  122.077514]  kasan_report+0x258/0x380
> [  122.077961]  ? listxattr_filler+0x170/0x1b0
> [  122.078469]  memcpy+0x34/0x50
> [  122.078894]  listxattr_filler+0x170/0x1b0
> [...]
> 
>  fs/reiserfs/xattr.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
> index ff94fad477e4..48cdfc81fe10 100644
> --- a/fs/reiserfs/xattr.c
> +++ b/fs/reiserfs/xattr.c
> @@ -792,8 +792,10 @@ static int listxattr_filler(struct dir_context *ctx, 
> const char *name,
>   return 0;
>   size = namelen + 1;
>   if (b->buf) {
> - if (size > b->size)
> + if (b->pos + size > b->size) {
> + b->pos = -ERANGE;
>   return -ERANGE;
> + }
>   memcpy(b->buf + b->pos, name, namelen);
>   b->buf[b->pos + namelen] = 0;
>   }
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [PATCH] reiserfs: fix broken xattr handling (heap corruption, bad retval)

2018-08-13 Thread Jeff Mahoney

On 8/13/18 2:04 PM, Jann Horn wrote:
> On Mon, Aug 13, 2018 at 7:42 PM Will Deacon  wrote:
>>
>> Hi Jann,
>>
>> On Fri, Aug 10, 2018 at 05:19:38AM +0200, Jann Horn wrote:
>>> On Thu, Aug 2, 2018 at 5:16 PM Jann Horn  wrote:
>>>>
>>>> This fixes the following issues:
>>>>
>>>>  - When a buffer size is supplied to reiserfs_listxattr() such that each
>>>>individual name fits, but the concatenation of all names doesn't
>>>>fit, reiserfs_listxattr() overflows the supplied buffer. This leads to
>>>>a kernel heap overflow (verified using KASAN) followed by an
>>>>out-of-bounds usercopy and is therefore a security bug.
>>>>  - When a buffer size is supplied to reiserfs_listxattr() such that a name
>>>>doesn't fit, -ERANGE should be returned. But reiserfs instead just
>>>>truncates the list of names; I have verified that if the only xattr on
>>>>a file has a longer name than the supplied buffer length, listxattr()
>>>>incorrectly returns zero.
>>>>
>>>> With my patch applied, -ERANGE is returned in both cases and the memory
>>>> corruption doesn't happen anymore.
>>>>
>>>> Credit for making me clean this code up a bit goes to Al Viro, who pointed
>>>> out that the ->actor calling convention is suboptimal and should be
>>>> changed.
>>>>
>>>> Fixes: 48b32a3553a5 ("reiserfs: use generic xattr handlers")
>>>> Cc: sta...@vger.kernel.org
>>>> Signed-off-by: Jann Horn 
>>>
>>> +security@
>>> Ping. I have not received any replies to this patch, which fixes a
>>> kernel security bug, for a week.
>>> Whose tree should this go through? reiserfs is marked as "supported",
>>> but does not have a maintainer or a git repo listed, just a
>>> mailinglist, so I guess it probably has to go through either Al Viro's
>>> or akpm's tree? Looks like akpm signed off on the last commits in
>>> reiserfs...
>>
>> I think Andrew's tree makes the most sense for this,
> 
> Yeah, Andrew has already merged it. :)
> http://ozlabs.org/~akpm/mmots/broken-out/reiserfs-fix-broken-xattr-handling-heap-corruption-bad-retval.patch
> 
>> but perhaps we should
>> also patch MAINTAINERS so mark it as "Orphan"? Patch below.
> 
> Either that, or get someone to step up as maintainer? If I read
> https://marc.info/?l=reiserfs-devel=153214303506948=2#0 correctly,
> there's still an intent to fix things in reiserfs, even though no
> maintainer is listed. (Jeff Mahoney, who wrote that message and is
> CC'ed on this thread, seems to have been out of office last week - when
> I sent the "Ping" message a few days ago, I got a vacation
> autoresponder "I'll be out of the office until 13 August" from him.)

I suppose I can take a more active role here.  I'm probably the person
with the most experience with reiserfs who still has a role where I need
to care about it.

-Jeff

>> Will
>>
>> --->8
>>
>> From 07fbb021d5bbfe623fad10073b55704bda8e1f3d Mon Sep 17 00:00:00 2001
>> From: Will Deacon 
>> Date: Mon, 13 Aug 2018 18:31:50 +0100
>> Subject: [PATCH] MAINTAINERS: Mark reiserfs as Orphan
>>
>> Reiserfs has no Maintainer and random fixes tend to be merged through
>> with Andrew or Al's tree. Demote the filesystem to "Orphan", since it's
>> clear no longer supported by anybody.
>>
>> Reported-by: Jann Horn 
>> Signed-off-by: Will Deacon 
>> ---
>>  MAINTAINERS | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 544cac829cf4..b4fcc19cfb52 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -12077,7 +12077,7 @@ F:  include/linux/regmap.h
>>
>>  REISERFS FILE SYSTEM
>>  L: reiserfs-de...@vger.kernel.org
>> -S: Supported
>> +S: Orphan
>>  F: fs/reiserfs/
>>
>>  REMOTE PROCESSOR (REMOTEPROC) SUBSYSTEM
>> --
>> 2.1.4
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [PATCH] reiserfs: fix broken xattr handling (heap corruption, bad retval)

2018-08-13 Thread Jeff Mahoney

On 8/2/18 11:15 AM, Jann Horn wrote:
> This fixes the following issues:
> 
>  - When a buffer size is supplied to reiserfs_listxattr() such that each
>individual name fits, but the concatenation of all names doesn't
>fit, reiserfs_listxattr() overflows the supplied buffer. This leads to
>a kernel heap overflow (verified using KASAN) followed by an
>out-of-bounds usercopy and is therefore a security bug.
>  - When a buffer size is supplied to reiserfs_listxattr() such that a name
>doesn't fit, -ERANGE should be returned. But reiserfs instead just
>truncates the list of names; I have verified that if the only xattr on
>a file has a longer name than the supplied buffer length, listxattr()
>incorrectly returns zero.
> 
> With my patch applied, -ERANGE is returned in both cases and the memory
> corruption doesn't happen anymore.
> 
> Credit for making me clean this code up a bit goes to Al Viro, who pointed
> out that the ->actor calling convention is suboptimal and should be
> changed.
> 
> Fixes: 48b32a3553a5 ("reiserfs: use generic xattr handlers")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jann Horn 

Acked-by: Jeff Mahoney 

Thanks,

-Jeff

> ---
> Triggering the bug:
> 
> root@debian:/home/user# mount -o user_xattr reiserimg reisermount/
> root@debian:/home/user# cd reisermount/
> root@debian:/home/user/reisermount# touch test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo1 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo2 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo3 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo4 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo5 -v A test_file
> root@debian:/home/user/reisermount# setfattr -n user.foo6 -v A test_file
> root@debian:/home/user/reisermount# cat xattr_test.c
> #include 
> #include 
> #include 
> #include 
> #include 
> int main(int argc, char **argv) {
>   if (argc != 2) errx(1, "bad invocation");
>   char list[10];
>   int res = listxattr(argv[1], list, sizeof(list));
>   if (res == -1)
> err(1, "listxattr failed");
>   printf("listxattr returned %d\n", res);
>   for (char *p = list; p < list+res-1; p = p + strlen(p) + 1) {
> printf("list entry: %s\n", p);
>   }
> }
> root@debian:/home/user/reisermount# gcc -o xattr_test xattr_test.c
> root@debian:/home/user/reisermount# ./xattr_test test_file
> Segmentation fault
> root@debian:/home/user/reisermount#
> 
> Result:
> 
> [  122.071318] 
> ==
> [  122.072334] BUG: KASAN: slab-out-of-bounds in listxattr_filler+0x170/0x1b0
> [  122.073173] Write of size 9 at addr 8801c43b474a by task xattr_test/923
> [  122.074030]
> [  122.074223] CPU: 1 PID: 923 Comm: xattr_test Not tainted 4.18.0-rc7+ #67
> [  122.075050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.10.2-1 04/01/2014
> [  122.076107] Call Trace:
> [  122.076453]  dump_stack+0x71/0xab
> [  122.076900]  print_address_description+0x6a/0x250
> [  122.077514]  kasan_report+0x258/0x380
> [  122.077961]  ? listxattr_filler+0x170/0x1b0
> [  122.078469]  memcpy+0x34/0x50
> [  122.078894]  listxattr_filler+0x170/0x1b0
> [...]
> 
>  fs/reiserfs/xattr.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
> index ff94fad477e4..48cdfc81fe10 100644
> --- a/fs/reiserfs/xattr.c
> +++ b/fs/reiserfs/xattr.c
> @@ -792,8 +792,10 @@ static int listxattr_filler(struct dir_context *ctx, 
> const char *name,
>   return 0;
>   size = namelen + 1;
>   if (b->buf) {
> - if (size > b->size)
> + if (b->pos + size > b->size) {
> + b->pos = -ERANGE;
>   return -ERANGE;
> + }
>   memcpy(b->buf + b->pos, name, namelen);
>   b->buf[b->pos + namelen] = 0;
>   }
> 

-- 
Jeff Mahoney
SUSE Labs




signature.asc
Description: OpenPGP digital signature

Re: [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root

2018-05-08 Thread Jeff Mahoney

rc.info/?l=linux-btrfs=130532890824992=2
>>
>> During the discussion, one question did come up - why can't
>> filesystems like Btrfs use a superblock per subvolume? There's a
>> couple of problems with that:
>>
>> - It's common for a single Btrfs filesystem to have thousands of
>>   subvolumes. So keeping a superblock for each subvol in memory would
>>   get prohibively expensive - imagine having 8000 copies of struct
>>   super_block for a file system just because we wanted some separation
>>   of say, s_dev.
> 
> That's no different to using individual overlay mounts for the
> thousands of containers that are on the system. This doesn't seem to
> be a major problem...

Overlay mounts are indepedent of one another and don't need coordination
among them.  The memory usage is relatively unimportant.  The important
part is having a bunch of superblocks that all correspond to the same
resources and coordinating them at the VFS level.  Your assumptions
below follow how your XFS subvolumes work, where there's a clear
hierarchy between the subvolumes and the master filesystem with a
mapping layer between them.  Btrfs subvolumes have no such hierarchy.
Everything is shared.  So while we could use a writeback hierarchy to
merge all of the inode lists before doing writeback on the 'master'
superblock, we'd gain nothing by it.  Handling anything involving
s_umount with a superblock per subvolume would be a nightmare.
Ultimately, it would be a ton of effort that amounts to working around
the VFS instead of with it.

>> - Writeback would also have to walk all of these superblocks -
>>   again not very good for system performance.
> 
> Background writeback is backing device focussed, not superblock
> focussed. It will only iterate the superblocks that have dirty
> inodes on the bdi writeback lists, not all the superblocks on the
> bdi. IOWs, this isn't a major problem except for sync() operations
> that iterate superblocks.
> 
>> - Anyone wanting to lock down I/O on a filesystem would have to
>> freeze all the superblocks. This goes for most things related to
>> I/O really - we simply can't afford to have the kernel walking
>> thousands of superblocks to sync a single fs.
> 
> Not with XFS subvolumes. Freezing the underlying parent filesystem
> will effectively stop all IO from the mounted subvolumes by freezing
> remapping calls before IO. Sure, those subvolumes aren't in a
> consistent state, but we don't freeze userspace so none of the
> application data is ever in a consistent state when filesystems are
> frozen.
> 
> So, again, I'm not sure there's /subvolume/ problem here. There's
> definitely a "freeze heirarchy" problem, but that already exists and
> it's something we talked about at LSFMM because we need to solve it
> for reliable hibernation.

There's only a freeze hierarchy problem if we have to use multiple
superblocks.  Otherwise, we freeze the whole thing or we don't.  Trying
to freeze a single subvolume would be an illusion for the user since the
underlying file system would still be active underneath it.  Under the
hood, things like relocation don't even look at what subvolume owns a
particular extent until it must.  So it could be coordinating thousands
of superblocks to do what a single lock does now and for what benefit?

>> It's far more efficient then to pull those fields we need for a
>> subvolume namespace into their own structure.
>
> I'm not convinced yet - it still feels like it's the wrong layer to
> be solving the multiple namespace per superblock problem

It needs to be between the inode and the superblock.  If there are
multiple user-visible namespaces, each will still get the same
underlying file system namespace.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [RFC][PATCH 0/76] vfs: 'views' for filesystems with more than one root

2018-05-08 Thread Jeff Mahoney

rc.info/?l=linux-btrfs=130532890824992=2
>>
>> During the discussion, one question did come up - why can't
>> filesystems like Btrfs use a superblock per subvolume? There's a
>> couple of problems with that:
>>
>> - It's common for a single Btrfs filesystem to have thousands of
>>   subvolumes. So keeping a superblock for each subvol in memory would
>>   get prohibively expensive - imagine having 8000 copies of struct
>>   super_block for a file system just because we wanted some separation
>>   of say, s_dev.
> 
> That's no different to using individual overlay mounts for the
> thousands of containers that are on the system. This doesn't seem to
> be a major problem...

Overlay mounts are indepedent of one another and don't need coordination
among them.  The memory usage is relatively unimportant.  The important
part is having a bunch of superblocks that all correspond to the same
resources and coordinating them at the VFS level.  Your assumptions
below follow how your XFS subvolumes work, where there's a clear
hierarchy between the subvolumes and the master filesystem with a
mapping layer between them.  Btrfs subvolumes have no such hierarchy.
Everything is shared.  So while we could use a writeback hierarchy to
merge all of the inode lists before doing writeback on the 'master'
superblock, we'd gain nothing by it.  Handling anything involving
s_umount with a superblock per subvolume would be a nightmare.
Ultimately, it would be a ton of effort that amounts to working around
the VFS instead of with it.

>> - Writeback would also have to walk all of these superblocks -
>>   again not very good for system performance.
> 
> Background writeback is backing device focussed, not superblock
> focussed. It will only iterate the superblocks that have dirty
> inodes on the bdi writeback lists, not all the superblocks on the
> bdi. IOWs, this isn't a major problem except for sync() operations
> that iterate superblocks.
> 
>> - Anyone wanting to lock down I/O on a filesystem would have to
>> freeze all the superblocks. This goes for most things related to
>> I/O really - we simply can't afford to have the kernel walking
>> thousands of superblocks to sync a single fs.
> 
> Not with XFS subvolumes. Freezing the underlying parent filesystem
> will effectively stop all IO from the mounted subvolumes by freezing
> remapping calls before IO. Sure, those subvolumes aren't in a
> consistent state, but we don't freeze userspace so none of the
> application data is ever in a consistent state when filesystems are
> frozen.
> 
> So, again, I'm not sure there's /subvolume/ problem here. There's
> definitely a "freeze heirarchy" problem, but that already exists and
> it's something we talked about at LSFMM because we need to solve it
> for reliable hibernation.

There's only a freeze hierarchy problem if we have to use multiple
superblocks.  Otherwise, we freeze the whole thing or we don't.  Trying
to freeze a single subvolume would be an illusion for the user since the
underlying file system would still be active underneath it.  Under the
hood, things like relocation don't even look at what subvolume owns a
particular extent until it must.  So it could be coordinating thousands
of superblocks to do what a single lock does now and for what benefit?

>> It's far more efficient then to pull those fields we need for a
>> subvolume namespace into their own structure.
>
> I'm not convinced yet - it still feels like it's the wrong layer to
> be solving the multiple namespace per superblock problem

It needs to be between the inode and the superblock.  If there are
multiple user-visible namespaces, each will still get the same
underlying file system namespace.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2018-04-26 Thread Jeff Mahoney

On 4/24/18 10:14 AM, Eric W. Biederman wrote:
> je...@suse.com writes:
> 
>> From: Jeff Mahoney <je...@suse.com>
>>
>> Hi all -
>>
>> I recently encountered a customer issue where, on a machine with many TiB
>> of memory and a few hundred cores, after a task with a few thousand threads
>> and hundreds of files open exited, the system would softlockup.  That
>> issue was (is still) being addressed by Nik Borisov's patch to add a
>> cond_resched call to shrink_dentry_list.  The underlying issue is still
>> there, though.  We just don't complain as loudly.  When a huge task
>> exits, now the system is more or less unresponsive for about eight
>> minutes.  All CPUs are pinned and every one of them is going through
>> dentry and inode eviction for the procfs files associated with each
>> thread.  It's made worse by every CPU contending on the super's
>> inode list lock.
>>
>> The numbers get big.  My test case was 4096 threads with 16384 files
>> open.  It's a contrived example, but not that far off from the actual
>> customer case.  In this case, a simple "find /proc" would create around
>> 300 million dentry/inode pairs.  More practically, lsof(1) does it too,
>> it just takes longer.  On smaller systems, memory pressure starts pushing
>> them out. Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.  It's the combination of
>> the wasted CPU cycles in teardown and the wasted memory at runtime that
>> pushed me to take this approach.
>>
>> The biggest culprit is the "fd" and "fdinfo" directories, but those are
>> made worse by there being multiple copies of them even for the same
>> task without threads getting involved:
>>
>> - /proc/pid/fd and /proc/pid/task/pid/fd are identical but share no
>>   resources.
>>
>> - Every /proc/pid/task/*/fd directory in a thread group has identical
>>   contents (unless unshare(CLONE_FILES) was called), but share no
>>   resources.
>>
>> - If we do a lookup like /proc/pid/fd on a member of a thread group,
>>   we'll get a valid directory.  Inside, there will be a complete
>>   copy of /proc/pid/task/* just like in /proc/tgid/task.  Again,
>>   nothing is shared.
>>
>> This patch set reduces some (most) of the duplication by conditionally
>> replacing some of the directories with symbolic links to copies that are
>> identical.
>>
>> 1) Eliminate the duplication of the task directories between threads.
>>The task directory belongs to the thread leader and the threads
>>link to it: e.g. /proc/915/task -> ../910/task  This mainly
>>reduces duplication when individual threads are looked up directly
>>at the tgid level.  The impact varies based on the number of threads.
>>The user has to go out of their way in order to mess up their system
>>in this way.  But if they were so inclined, they could create ~550
>>billion inodes and dentries using the test case.
>>
>> 2) Eliminate the duplication of directories that are created identically
>>between the tgid-level pid directory and its task directory: fd,
>>fdinfo, ns, net, attr.  There is obviously more duplication between
>>the two directories, but replacing a file with a symbolic link
>>doesn't get us anything.  This reduces the number of files associated
>>with fd and fdinfo by half if threads aren't involved.
>>
>> 3) Eliminate the duplication of fd and fdinfo directories among threads
>>that share a files_struct.  We check at directory creation time if
>>the task is a group leader and if not, whether it shares ->files with
>>the group leader.  If so, we create a symbolic link to ../tgid/fd*.
>>We use a d_revalidate callback to check whether the thread has called
>>unshare(CLONE_FILES) and, if so, fail the revalidation for the symlink.
>>Upon re-lookup, a directory will be created in its place.  This is
>>pretty simple, so if the thread group leader calls unshare, all threads
>>get directories.
>>
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer.  I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)
>>
>> Finally, I'm not a procfs expert.  I'm posting this as an RFC for folks
>> with more knowledge of the details to pick it apart.  The biggest is that
>> I'm not sure if any tools depend on any of these things being directories
>> instead of symli

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2018-04-26 Thread Jeff Mahoney

On 4/24/18 10:14 AM, Eric W. Biederman wrote:
> je...@suse.com writes:
> 
>> From: Jeff Mahoney 
>>
>> Hi all -
>>
>> I recently encountered a customer issue where, on a machine with many TiB
>> of memory and a few hundred cores, after a task with a few thousand threads
>> and hundreds of files open exited, the system would softlockup.  That
>> issue was (is still) being addressed by Nik Borisov's patch to add a
>> cond_resched call to shrink_dentry_list.  The underlying issue is still
>> there, though.  We just don't complain as loudly.  When a huge task
>> exits, now the system is more or less unresponsive for about eight
>> minutes.  All CPUs are pinned and every one of them is going through
>> dentry and inode eviction for the procfs files associated with each
>> thread.  It's made worse by every CPU contending on the super's
>> inode list lock.
>>
>> The numbers get big.  My test case was 4096 threads with 16384 files
>> open.  It's a contrived example, but not that far off from the actual
>> customer case.  In this case, a simple "find /proc" would create around
>> 300 million dentry/inode pairs.  More practically, lsof(1) does it too,
>> it just takes longer.  On smaller systems, memory pressure starts pushing
>> them out. Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.  It's the combination of
>> the wasted CPU cycles in teardown and the wasted memory at runtime that
>> pushed me to take this approach.
>>
>> The biggest culprit is the "fd" and "fdinfo" directories, but those are
>> made worse by there being multiple copies of them even for the same
>> task without threads getting involved:
>>
>> - /proc/pid/fd and /proc/pid/task/pid/fd are identical but share no
>>   resources.
>>
>> - Every /proc/pid/task/*/fd directory in a thread group has identical
>>   contents (unless unshare(CLONE_FILES) was called), but share no
>>   resources.
>>
>> - If we do a lookup like /proc/pid/fd on a member of a thread group,
>>   we'll get a valid directory.  Inside, there will be a complete
>>   copy of /proc/pid/task/* just like in /proc/tgid/task.  Again,
>>   nothing is shared.
>>
>> This patch set reduces some (most) of the duplication by conditionally
>> replacing some of the directories with symbolic links to copies that are
>> identical.
>>
>> 1) Eliminate the duplication of the task directories between threads.
>>The task directory belongs to the thread leader and the threads
>>link to it: e.g. /proc/915/task -> ../910/task  This mainly
>>reduces duplication when individual threads are looked up directly
>>at the tgid level.  The impact varies based on the number of threads.
>>The user has to go out of their way in order to mess up their system
>>in this way.  But if they were so inclined, they could create ~550
>>billion inodes and dentries using the test case.
>>
>> 2) Eliminate the duplication of directories that are created identically
>>between the tgid-level pid directory and its task directory: fd,
>>fdinfo, ns, net, attr.  There is obviously more duplication between
>>the two directories, but replacing a file with a symbolic link
>>doesn't get us anything.  This reduces the number of files associated
>>with fd and fdinfo by half if threads aren't involved.
>>
>> 3) Eliminate the duplication of fd and fdinfo directories among threads
>>that share a files_struct.  We check at directory creation time if
>>the task is a group leader and if not, whether it shares ->files with
>>the group leader.  If so, we create a symbolic link to ../tgid/fd*.
>>We use a d_revalidate callback to check whether the thread has called
>>unshare(CLONE_FILES) and, if so, fail the revalidation for the symlink.
>>Upon re-lookup, a directory will be created in its place.  This is
>>pretty simple, so if the thread group leader calls unshare, all threads
>>get directories.
>>
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer.  I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)
>>
>> Finally, I'm not a procfs expert.  I'm posting this as an RFC for folks
>> with more knowledge of the details to pick it apart.  The biggest is that
>> I'm not sure if any tools depend on any of these things being directories
>> instead of symlinks.  I'd hope

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2018-04-25 Thread Jeff Mahoney

On 4/24/18 2:17 AM, Alexey Dobriyan wrote:
> On Mon, Apr 23, 2018 at 10:21:01PM -0400, je...@suse.com wrote:
>> Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.
> 
> Text files at scale!
> 
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer.  I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)>
> Easy stuff:
> * all ->get_link hooks are broken in RCU lookup (use GFP_KERNEL),

It's a pretty common pattern in the kernel, but it's just as easy to set
inode->i_link during instantiation and keep RCU lookup.  There aren't so
many of these to make it a real burden on memory.

> * "%.*s" for dentry names is probably unnecessary,
>   they're always NUL terminated

Ack.

> * kasprintf() does printing twice, since we're kind of care about /proc
>   performance, allocate for the worst case.

Ack, integrated with ->get_link fix.

> * "int nlinks = nlink_tgid;"
>   Unsigned police.

Ack.  nlink_t{,g}id are both u8, but it's easy to make it consistent.

> * (inode->i_mode & S_IFLNK)
>   this is sketchy, S_ISLNK exists.
> 

Ack.

Notes of my own:

proc_task_count_links also had the logic backward.  It would add an
extra link to the count for the symlink rather than the dir.

proc_pid_files_revalidate only needs to check if the tasks share files
since it won't be called if it's not a symlink.

Thanks for the review,

-Jeff

-- 
Jeff Mahoney
SUSE Labs

Re: [RFC] [PATCH 0/5] procfs: reduce duplication by using symlinks

2018-04-25 Thread Jeff Mahoney

On 4/24/18 2:17 AM, Alexey Dobriyan wrote:
> On Mon, Apr 23, 2018 at 10:21:01PM -0400, je...@suse.com wrote:
>> Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.
> 
> Text files at scale!
> 
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer.  I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)>
> Easy stuff:
> * all ->get_link hooks are broken in RCU lookup (use GFP_KERNEL),

It's a pretty common pattern in the kernel, but it's just as easy to set
inode->i_link during instantiation and keep RCU lookup.  There aren't so
many of these to make it a real burden on memory.

> * "%.*s" for dentry names is probably unnecessary,
>   they're always NUL terminated

Ack.

> * kasprintf() does printing twice, since we're kind of care about /proc
>   performance, allocate for the worst case.

Ack, integrated with ->get_link fix.

> * "int nlinks = nlink_tgid;"
>   Unsigned police.

Ack.  nlink_t{,g}id are both u8, but it's easy to make it consistent.

> * (inode->i_mode & S_IFLNK)
>   this is sketchy, S_ISLNK exists.
> 

Ack.

Notes of my own:

proc_task_count_links also had the logic backward.  It would add an
extra link to the count for the symlink rather than the dir.

proc_pid_files_revalidate only needs to check if the tasks share files
since it won't be called if it's not a symlink.

Thanks for the review,

-Jeff

-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH?] reiserfs: prevent panic: don't allow %-char in journal dev. name

2018-04-06 Thread Jeff Mahoney

On 4/5/18 5:04 AM, Rasmus Villemoes wrote:
> On 2018-04-05 03:45, Andrew Morton wrote:
>> On Wed, 4 Apr 2018 18:25:16 -0700 Randy Dunlap <rdun...@infradead.org> wrote:
>>
>>> From: Randy Dunlap <rdun...@infradead.org>
>>>
>>> If the reiserfs mount option's journal name contains a '%' character,
>>> it can lead to a WARN_ONCE() in lib/vsprintf.c::format_decode(),
>>> saying: "Please remove unsupported %/ in format string."
>>> That's OK until panic_on_warn is set, at which point it's dead, Jim.
>>>
>>> To placate this situation, check the journal name string for a '%'
>>> character and return an error if one is found. Also print a warning
>>> (one that won't panic the kernel) about the invalid journal name (e.g.):
>>>
>>>   reiserfs: journal device name is invalid: %/file0
>>>
>>> (In this example, the caller app specified the journal device name as
>>> "%/file0".)
>>>
>>
>> Well, that is a valid filename and we should support it...
>>
>> Isn't the bug in journal_init_dev()?
> 
> Urgh. At first I was about to reply that the real bug was in reiserfs.h
> for failing to annotate __reiserfs_warning with __printf(). But digging
> into it, it turns out that it implements its own printf extensions, so
> that's obviously a non-starter. Now, one thing is that some of those
> extension clash with existing standard modifiers (%z and %h, so if
> someone adds a correct %zu thing to print a size_t in reiserfs things
> will break). But, and I hope I'm wrong about this and just hasn't had
> enough coffee, this seems completely broken:

Yep.  There are a bunch of ways that this is broken, but it's been "good
enough" for so long that no fix has landed.  Once upon a time, I wanted
to fix this by adding something similar to %pV that allowed the caller
to pass a set of handlers for additional types.  That didn't make it off
the ground.

There's another issue where we assume that % will only be followed by a
single character.  That won't cause runtime issues, but it will end up
putting those additional characters in the output.

Lastly, again not a runtime issue, is that the spinlock only covers
formatting the buffer.  It doesn't cover printing it.  You can end up
with part of the error buffer containing the format of another warning.

I'm working up something to fix most of the above.  I'll post it later
today or Monday.

-Jeff

> while ((k = is_there_reiserfs_struct(fmt1, )) != NULL) {
> *k = 0;
> 
> p += vsprintf(p, fmt1, args);
> 
> switch (what) {
> case 'k':
> sprintf_le_key(p, va_arg(args, struct
> reiserfs_key *));
> break;
> 
> On architectures where va_list is a typedef for a one-element array of
> some struct (x86-64), that works ok, because the vsprintf call can and
> does update the args metadata. But when args is just a pointer into the
> stack (i386), we don't know how much vsprintf consumed, and end up
> consuming the same arguments again - only this time we may interpret
> some random integer as a struct pointer...
> 
> A minimal program showing the difference:
> 
> #include 
> #include 
> 
> void f(const char *dummy, ...)
> {
>   va_list ap;
>   int i;
> 
>   va_start(ap, dummy);
>   for (i = 0; i < 5; ++i) {
>   vprintf("%d\n", ap);
>   printf("%d\n", va_arg(ap, int));
>   }
>   va_end(ap);
> }
> 
> int main(int argc, char *argv[])
> {
>   f("bla", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
>   return 0;
> }
> 
> Compiling for native (x86-64), this produces $(seq 10). But with -m32,
> one gets 1,1,2,2,3,3,4,4,5,5.
> 
> Assuming reiserfs (at least its debugging infrastructure) isn't broken
> on a bunch of architectures, I'm obviously missing something
> fundamental. Please enlighten me.
> 
> Rasmus
> 


-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH?] reiserfs: prevent panic: don't allow %-char in journal dev. name

2018-04-06 Thread Jeff Mahoney

On 4/5/18 5:04 AM, Rasmus Villemoes wrote:
> On 2018-04-05 03:45, Andrew Morton wrote:
>> On Wed, 4 Apr 2018 18:25:16 -0700 Randy Dunlap  wrote:
>>
>>> From: Randy Dunlap 
>>>
>>> If the reiserfs mount option's journal name contains a '%' character,
>>> it can lead to a WARN_ONCE() in lib/vsprintf.c::format_decode(),
>>> saying: "Please remove unsupported %/ in format string."
>>> That's OK until panic_on_warn is set, at which point it's dead, Jim.
>>>
>>> To placate this situation, check the journal name string for a '%'
>>> character and return an error if one is found. Also print a warning
>>> (one that won't panic the kernel) about the invalid journal name (e.g.):
>>>
>>>   reiserfs: journal device name is invalid: %/file0
>>>
>>> (In this example, the caller app specified the journal device name as
>>> "%/file0".)
>>>
>>
>> Well, that is a valid filename and we should support it...
>>
>> Isn't the bug in journal_init_dev()?
> 
> Urgh. At first I was about to reply that the real bug was in reiserfs.h
> for failing to annotate __reiserfs_warning with __printf(). But digging
> into it, it turns out that it implements its own printf extensions, so
> that's obviously a non-starter. Now, one thing is that some of those
> extension clash with existing standard modifiers (%z and %h, so if
> someone adds a correct %zu thing to print a size_t in reiserfs things
> will break). But, and I hope I'm wrong about this and just hasn't had
> enough coffee, this seems completely broken:

Yep.  There are a bunch of ways that this is broken, but it's been "good
enough" for so long that no fix has landed.  Once upon a time, I wanted
to fix this by adding something similar to %pV that allowed the caller
to pass a set of handlers for additional types.  That didn't make it off
the ground.

There's another issue where we assume that % will only be followed by a
single character.  That won't cause runtime issues, but it will end up
putting those additional characters in the output.

Lastly, again not a runtime issue, is that the spinlock only covers
formatting the buffer.  It doesn't cover printing it.  You can end up
with part of the error buffer containing the format of another warning.

I'm working up something to fix most of the above.  I'll post it later
today or Monday.

-Jeff

> while ((k = is_there_reiserfs_struct(fmt1, )) != NULL) {
> *k = 0;
> 
> p += vsprintf(p, fmt1, args);
> 
> switch (what) {
> case 'k':
> sprintf_le_key(p, va_arg(args, struct
> reiserfs_key *));
> break;
> 
> On architectures where va_list is a typedef for a one-element array of
> some struct (x86-64), that works ok, because the vsprintf call can and
> does update the args metadata. But when args is just a pointer into the
> stack (i386), we don't know how much vsprintf consumed, and end up
> consuming the same arguments again - only this time we may interpret
> some random integer as a struct pointer...
> 
> A minimal program showing the difference:
> 
> #include 
> #include 
> 
> void f(const char *dummy, ...)
> {
>   va_list ap;
>   int i;
> 
>   va_start(ap, dummy);
>   for (i = 0; i < 5; ++i) {
>   vprintf("%d\n", ap);
>   printf("%d\n", va_arg(ap, int));
>   }
>   va_end(ap);
> }
> 
> int main(int argc, char *argv[])
> {
>   f("bla", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
>   return 0;
> }
> 
> Compiling for native (x86-64), this produces $(seq 10). But with -m32,
> one gets 1,1,2,2,3,3,4,4,5,5.
> 
> Assuming reiserfs (at least its debugging infrastructure) isn't broken
> on a bunch of architectures, I'm obviously missing something
> fundamental. Please enlighten me.
> 
> Rasmus
> 


-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH?] reiserfs: prevent panic: don't allow %-char in journal dev. name

2018-04-04 Thread Jeff Mahoney

On 4/4/18 9:45 PM, Andrew Morton wrote:
> On Wed, 4 Apr 2018 18:25:16 -0700 Randy Dunlap <rdun...@infradead.org> wrote:
> 
>> From: Randy Dunlap <rdun...@infradead.org>
>>
>> If the reiserfs mount option's journal name contains a '%' character,
>> it can lead to a WARN_ONCE() in lib/vsprintf.c::format_decode(),
>> saying: "Please remove unsupported %/ in format string."
>> That's OK until panic_on_warn is set, at which point it's dead, Jim.
>>
>> To placate this situation, check the journal name string for a '%'
>> character and return an error if one is found. Also print a warning
>> (one that won't panic the kernel) about the invalid journal name (e.g.):
>>
>>   reiserfs: journal device name is invalid: %/file0
>>
>> (In this example, the caller app specified the journal device name as
>> "%/file0".)
>>
> 
> Well, that is a valid filename and we should support it...
> 
> Isn't the bug in journal_init_dev()?

Yep.  That's exactly it.

Acked-by: Jeff Mahoney <je...@suse.com>

Thanks,

-Jeff

> --- a/fs/reiserfs/journal.c~a
> +++ a/fs/reiserfs/journal.c
> @@ -2643,7 +2643,7 @@ static int journal_init_dev(struct super
>   if (IS_ERR(journal->j_dev_bd)) {
>   result = PTR_ERR(journal->j_dev_bd);
>   journal->j_dev_bd = NULL;
> - reiserfs_warning(super,
> + reiserfs_warning(super, "sh-457",
>"journal_init_dev: Cannot open '%s': %i",
>jdev_name, result);
>   return result;
> _
> 
> 


-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH?] reiserfs: prevent panic: don't allow %-char in journal dev. name

2018-04-04 Thread Jeff Mahoney

On 4/4/18 9:45 PM, Andrew Morton wrote:
> On Wed, 4 Apr 2018 18:25:16 -0700 Randy Dunlap  wrote:
> 
>> From: Randy Dunlap 
>>
>> If the reiserfs mount option's journal name contains a '%' character,
>> it can lead to a WARN_ONCE() in lib/vsprintf.c::format_decode(),
>> saying: "Please remove unsupported %/ in format string."
>> That's OK until panic_on_warn is set, at which point it's dead, Jim.
>>
>> To placate this situation, check the journal name string for a '%'
>> character and return an error if one is found. Also print a warning
>> (one that won't panic the kernel) about the invalid journal name (e.g.):
>>
>>   reiserfs: journal device name is invalid: %/file0
>>
>> (In this example, the caller app specified the journal device name as
>> "%/file0".)
>>
> 
> Well, that is a valid filename and we should support it...
> 
> Isn't the bug in journal_init_dev()?

Yep.  That's exactly it.

Acked-by: Jeff Mahoney 

Thanks,

-Jeff

> --- a/fs/reiserfs/journal.c~a
> +++ a/fs/reiserfs/journal.c
> @@ -2643,7 +2643,7 @@ static int journal_init_dev(struct super
>   if (IS_ERR(journal->j_dev_bd)) {
>   result = PTR_ERR(journal->j_dev_bd);
>   journal->j_dev_bd = NULL;
> - reiserfs_warning(super,
> + reiserfs_warning(super, "sh-457",
>    "journal_init_dev: Cannot open '%s': %i",
>jdev_name, result);
>   return result;
> _
> 
> 


-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH 4/9] build: update AC_PACKAGE_WANT_GDBM() and src/dbtest.c to build

2018-03-15 Thread Jeff Mahoney

On 12/14/17 12:51 AM, Eryu Guan wrote:
> On Tue, Dec 12, 2017 at 04:45:14PM -0800, Luis R. Rodriguez wrote:
>> Modern gdbm-devel packages bundle together gdbm.h and ndbm.h.
>> The old m4 macro had detection support for some old gdbm libraries
>> but not for new ones.
>>
>> We fix compilation of src/dbtest.c by making the autoconf helper
>> check for this new arrangement:
>>
>> If both gdbm.h and gdbm.h are found define set both gdbm_ndbm_=true,
>  ^^ ndbm.h?
>> and have_db=true, and define HAVE_GDBM_H. The src/dbtest.c already
>> had a HAVE_GDBM_H but there was never a respective autoconf settter for
>> it. We can just re-use this and fix it for new arrangement.
>>
>> Signed-off-by: Luis R. Rodriguez <mcg...@kernel.org>
> 
> This looks fine to me.
> 
> The only system I have by hand that have both  and  but
> not any  is openSUSE Tumbleweed. Without this patch,
> dbtest was not built on openSUSE, and was built successfully with this
> patch applied. And dbtest is still built on RHEL6/7 and Fedora.
> 
> BTW, I'll queue patch 3 and this patch for next fstests release, while
> other patches seem not necessary, I agreed with Dave that groups are not
> for excluding tests, the required tools and environments should be
> detected by tests and _notrun if not met. (The README change looks fine,
> but it doesn't apply due to the "fsgqa-381" change, so I drop it too for
> now.)

Hi guys -

This change breaks on older releases like SLES 11 where both 
and  define datum, so we get build failures.  The failure is
new, but not because it used to pass and now doesn't.  It's apparently
never built on SLES releases since we ship /usr/include/ndbm.h and then
we notrun the test that uses.  Now that we're looking for gdbm.h and
find it, we attempt to build src/dbtest and fail.

This fix isn't the right solution.  The problem is that we have a couple
layers of old cruft that needs to be cleaned out.

1) As Luis notes, nothing sets HAVE_GDBM_H.  The thing is that there is
no version of gdbm.h that exports the NDBM interface.  Further, looking
at the git history, nothing has ever set HAVE_GDBM_H.  It was dead code
when it was committed initially as best I can tell.
2) openSUSE Tumbleweed doesn't need  at all.  It needs 
and this fix works because Luis added it to the HAVE_GDBM_H stanza.
3) AC_PACKAGE_WANT_NDBM used to check for  but it was a check
for IRIX and the caller was removed ages ago.  It wouldn't matter if it
were called anyway since libndbm is an IRIX library.  Linux, IIRC, has
never shipped a libndbm.

I'll post a few patches following this to clean it up and get it working
on SLES11.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: [PATCH 4/9] build: update AC_PACKAGE_WANT_GDBM() and src/dbtest.c to build

2018-03-15 Thread Jeff Mahoney

On 12/14/17 12:51 AM, Eryu Guan wrote:
> On Tue, Dec 12, 2017 at 04:45:14PM -0800, Luis R. Rodriguez wrote:
>> Modern gdbm-devel packages bundle together gdbm.h and ndbm.h.
>> The old m4 macro had detection support for some old gdbm libraries
>> but not for new ones.
>>
>> We fix compilation of src/dbtest.c by making the autoconf helper
>> check for this new arrangement:
>>
>> If both gdbm.h and gdbm.h are found define set both gdbm_ndbm_=true,
>  ^^ ndbm.h?
>> and have_db=true, and define HAVE_GDBM_H. The src/dbtest.c already
>> had a HAVE_GDBM_H but there was never a respective autoconf settter for
>> it. We can just re-use this and fix it for new arrangement.
>>
>> Signed-off-by: Luis R. Rodriguez 
> 
> This looks fine to me.
> 
> The only system I have by hand that have both  and  but
> not any  is openSUSE Tumbleweed. Without this patch,
> dbtest was not built on openSUSE, and was built successfully with this
> patch applied. And dbtest is still built on RHEL6/7 and Fedora.
> 
> BTW, I'll queue patch 3 and this patch for next fstests release, while
> other patches seem not necessary, I agreed with Dave that groups are not
> for excluding tests, the required tools and environments should be
> detected by tests and _notrun if not met. (The README change looks fine,
> but it doesn't apply due to the "fsgqa-381" change, so I drop it too for
> now.)

Hi guys -

This change breaks on older releases like SLES 11 where both 
and  define datum, so we get build failures.  The failure is
new, but not because it used to pass and now doesn't.  It's apparently
never built on SLES releases since we ship /usr/include/ndbm.h and then
we notrun the test that uses.  Now that we're looking for gdbm.h and
find it, we attempt to build src/dbtest and fail.

This fix isn't the right solution.  The problem is that we have a couple
layers of old cruft that needs to be cleaned out.

1) As Luis notes, nothing sets HAVE_GDBM_H.  The thing is that there is
no version of gdbm.h that exports the NDBM interface.  Further, looking
at the git history, nothing has ever set HAVE_GDBM_H.  It was dead code
when it was committed initially as best I can tell.
2) openSUSE Tumbleweed doesn't need  at all.  It needs 
and this fix works because Luis added it to the HAVE_GDBM_H stanza.
3) AC_PACKAGE_WANT_NDBM used to check for  but it was a check
for IRIX and the caller was removed ages ago.  It wouldn't matter if it
were called anyway since libndbm is an IRIX library.  Linux, IIRC, has
never shipped a libndbm.

I'll post a few patches following this to clean it up and get it working
on SLES11.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()

2017-08-23 Thread Jeff Mahoney

On 8/15/14 5:29 AM, Al Viro wrote:
> On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
> 
>> Christoph had noted that this seemed associated to the problem
>> that the btrfs uses different assignments for st_dev than s_dev,
>> but much as I'd like to see that changed based on discussions so
>> far its unclear if this is going to be possible unless strong
>> commitment is reached.

Resurrecting a dead thread since we've been carrying this patch anyway
since then.

> Explain, please.  Whose commitment and commitment to what, exactly?
> Having different ->st_dev values for different files on the same
> fs is a bloody bad idea; why does btrfs do that at all?  If nothing else,
> it breaks the usual "are those two files on the same fs?" tests...

It's because btrfs snapshots would have inode number collisions.
Changing the inode numbers for snapshots would negate a big benefit of
btrfs snapshots: the quick creation and lightweight on-disk
representation due to metadata sharing.

The thing is that ustat() used to work.  Your commit 0ee5dc676a5f8
(btrfs: kill magical embedded struct superblock) had a regression:
Since it replaced the superblock with a simple dev_t, it rendered the
device no longer discoverable by user_get_super.  We need a list_head to
attach for searching.

There's an argument that this is hacky.  It's valid.  The only other
feedback I've heard is to use a real superblock for subvolumes to do
this instead.  That doesn't work either, due to things like freeze/thaw
and inode writeback.  Ultimately, what we need is a single file system
with multiple namespaces.  Years ago we just needed different inode
namespaces, but as people have started adopting btrfs for containers, we
need more than that.  I've heard requests for per-subvolume security
contexts.  I'd imagine user namespaces are on someone's wish list.  A
working df can be done with ->d_automount, but the way btrfs handles
having a "canonical" subvolume location has always been a way to avoid
directory loops.  I'd like to just automount subvolumes everywhere
they're referenced.  One solution, for which I have no code yet, is to
have something like a superblock-light that we can hang things like a
security context, a user namespace, and an anonymous dev.  Most file
systems would have just one.  Btrfs would have one per subvolume.

That's a big project with a bunch of discussion.  So for now, I'd like
to move this patch forward while we (I) work on the bigger issue.

BTW, in this same thread, Christoph said:> Again, NAK.  Make btrfs
report the proper anon dev_t in stat and
> everything will just work.

We do.  We did then too.  But what doesn't work is a user doing stat()
and then using the dev_t to call ustat().

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: [RFC v3 0/2] vfs / btrfs: add support for ustat()

2017-08-23 Thread Jeff Mahoney

On 8/15/14 5:29 AM, Al Viro wrote:
> On Thu, Aug 14, 2014 at 07:58:56PM -0700, Luis R. Rodriguez wrote:
> 
>> Christoph had noted that this seemed associated to the problem
>> that the btrfs uses different assignments for st_dev than s_dev,
>> but much as I'd like to see that changed based on discussions so
>> far its unclear if this is going to be possible unless strong
>> commitment is reached.

Resurrecting a dead thread since we've been carrying this patch anyway
since then.

> Explain, please.  Whose commitment and commitment to what, exactly?
> Having different ->st_dev values for different files on the same
> fs is a bloody bad idea; why does btrfs do that at all?  If nothing else,
> it breaks the usual "are those two files on the same fs?" tests...

It's because btrfs snapshots would have inode number collisions.
Changing the inode numbers for snapshots would negate a big benefit of
btrfs snapshots: the quick creation and lightweight on-disk
representation due to metadata sharing.

The thing is that ustat() used to work.  Your commit 0ee5dc676a5f8
(btrfs: kill magical embedded struct superblock) had a regression:
Since it replaced the superblock with a simple dev_t, it rendered the
device no longer discoverable by user_get_super.  We need a list_head to
attach for searching.

There's an argument that this is hacky.  It's valid.  The only other
feedback I've heard is to use a real superblock for subvolumes to do
this instead.  That doesn't work either, due to things like freeze/thaw
and inode writeback.  Ultimately, what we need is a single file system
with multiple namespaces.  Years ago we just needed different inode
namespaces, but as people have started adopting btrfs for containers, we
need more than that.  I've heard requests for per-subvolume security
contexts.  I'd imagine user namespaces are on someone's wish list.  A
working df can be done with ->d_automount, but the way btrfs handles
having a "canonical" subvolume location has always been a way to avoid
directory loops.  I'd like to just automount subvolumes everywhere
they're referenced.  One solution, for which I have no code yet, is to
have something like a superblock-light that we can hang things like a
security context, a user namespace, and an anonymous dev.  Most file
systems would have just one.  Btrfs would have one per subvolume.

That's a big project with a bunch of discussion.  So for now, I'd like
to move this patch forward while we (I) work on the bigger issue.

BTW, in this same thread, Christoph said:> Again, NAK.  Make btrfs
report the proper anon dev_t in stat and
> everything will just work.

We do.  We did then too.  But what doesn't work is a user doing stat()
and then using the dev_t to call ustat().

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: [PATCH 3/5] btrfs: Use common error handling code in btrfs_update_root()

2017-08-21 Thread Jeff Mahoney

On 8/21/17 8:40 AM, SF Markus Elfring wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Mon, 21 Aug 2017 13:10:15 +0200
> 
> Add a jump target so that a bit of exception handling can be better reused
> at the end of this function.
> 
> This issue was detected by using the Coccinelle software.

btrfs_abort_transaction dumps __FILE__:__LINE__ in the log so this patch
makes the code more difficult to debug.

-Jeff

> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
> ---
>  fs/btrfs/root-tree.c | 27 +++
>  1 file changed, 11 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index 5b488af6f25e..bc497ba9d9d1 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -145,10 +145,8 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   return -ENOMEM;
>  
>   ret = btrfs_search_slot(trans, root, key, path, 0, 1);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>  
>   if (ret != 0) {
>   btrfs_print_leaf(path->nodes[0]);
> @@ -171,23 +169,17 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   btrfs_release_path(path);
>   ret = btrfs_search_slot(trans, root, key, path,
>   -1, 1);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>  
>   ret = btrfs_del_item(trans, root, path);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   btrfs_release_path(path);
>   ret = btrfs_insert_empty_item(trans, root, path,
>   key, sizeof(*item));
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   l = path->nodes[0];
>   slot = path->slots[0];
>   ptr = btrfs_item_ptr_offset(l, slot);
> @@ -204,6 +196,9 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>  out:
>   btrfs_free_path(path);
>   return ret;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  int btrfs_insert_root(struct btrfs_trans_handle *trans, struct btrfs_root 
> *root,
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 3/5] btrfs: Use common error handling code in btrfs_update_root()

2017-08-21 Thread Jeff Mahoney

On 8/21/17 8:40 AM, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Mon, 21 Aug 2017 13:10:15 +0200
> 
> Add a jump target so that a bit of exception handling can be better reused
> at the end of this function.
> 
> This issue was detected by using the Coccinelle software.

btrfs_abort_transaction dumps __FILE__:__LINE__ in the log so this patch
makes the code more difficult to debug.

-Jeff

> Signed-off-by: Markus Elfring 
> ---
>  fs/btrfs/root-tree.c | 27 +++
>  1 file changed, 11 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index 5b488af6f25e..bc497ba9d9d1 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -145,10 +145,8 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   return -ENOMEM;
>  
>   ret = btrfs_search_slot(trans, root, key, path, 0, 1);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>  
>   if (ret != 0) {
>   btrfs_print_leaf(path->nodes[0]);
> @@ -171,23 +169,17 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>   btrfs_release_path(path);
>   ret = btrfs_search_slot(trans, root, key, path,
>   -1, 1);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>  
>   ret = btrfs_del_item(trans, root, path);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   btrfs_release_path(path);
>   ret = btrfs_insert_empty_item(trans, root, path,
>   key, sizeof(*item));
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   l = path->nodes[0];
>   slot = path->slots[0];
>   ptr = btrfs_item_ptr_offset(l, slot);
> @@ -204,6 +196,9 @@ int btrfs_update_root(struct btrfs_trans_handle *trans, 
> struct btrfs_root
>  out:
>   btrfs_free_path(path);
>   return ret;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  int btrfs_insert_root(struct btrfs_trans_handle *trans, struct btrfs_root 
> *root,
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 4/5] btrfs: Use common error handling code in update_ref_path()

2017-08-21 Thread Jeff Mahoney

On 8/21/17 8:41 AM, SF Markus Elfring wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Mon, 21 Aug 2017 13:34:29 +0200
> 
> Add a jump target so that a bit of exception handling can be better reused
> in this function.
> 
> This issue was detected by using the Coccinelle software.

Adding a jump label in the middle of a conditional for "common" error
handling makes the code more difficult to understand.

-Jeff

> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
> ---
>  fs/btrfs/send.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 59fb1ed6ca20..a96edc91a101 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -3697,12 +3697,12 @@ static int update_ref_path(struct send_ctx *sctx, 
> struct recorded_ref *ref)
>   return -ENOMEM;
>  
>   ret = get_cur_path(sctx, ref->dir, ref->dir_gen, new_path);
> - if (ret < 0) {
> - fs_path_free(new_path);
> - return ret;
> - }
> + if (ret < 0)
> + goto free_path;
> +
>   ret = fs_path_add(new_path, ref->name, ref->name_len);
>   if (ret < 0) {
> +free_path:
>   fs_path_free(new_path);
>   return ret;
>   }
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 4/5] btrfs: Use common error handling code in update_ref_path()

2017-08-21 Thread Jeff Mahoney

On 8/21/17 8:41 AM, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Mon, 21 Aug 2017 13:34:29 +0200
> 
> Add a jump target so that a bit of exception handling can be better reused
> in this function.
> 
> This issue was detected by using the Coccinelle software.

Adding a jump label in the middle of a conditional for "common" error
handling makes the code more difficult to understand.

-Jeff

> Signed-off-by: Markus Elfring 
> ---
>  fs/btrfs/send.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index 59fb1ed6ca20..a96edc91a101 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -3697,12 +3697,12 @@ static int update_ref_path(struct send_ctx *sctx, 
> struct recorded_ref *ref)
>   return -ENOMEM;
>  
>   ret = get_cur_path(sctx, ref->dir, ref->dir_gen, new_path);
> - if (ret < 0) {
> - fs_path_free(new_path);
> - return ret;
> - }
> + if (ret < 0)
> + goto free_path;
> +
>   ret = fs_path_add(new_path, ref->name, ref->name_len);
>   if (ret < 0) {
> +free_path:
>   fs_path_free(new_path);
>   return ret;
>   }
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 2/5] btrfs: Use common error handling code in __btrfs_free_extent()

2017-08-21 Thread Jeff Mahoney

, _ref);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>   } else {
>   if (found_extent) {
> @@ -7078,37 +7067,33 @@ static int __btrfs_free_extent(struct 
> btrfs_trans_handle *trans,
>   last_ref = 1;
>   ret = btrfs_del_items(trans, extent_root, path, path->slots[0],
> num_to_del);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
> +
>   btrfs_release_path(path);
>  
>   if (is_data) {
>   ret = btrfs_del_csums(trans, info, bytenr, num_bytes);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>  
>   ret = add_to_free_space_tree(trans, info, bytenr, num_bytes);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>  
>   ret = update_block_group(trans, info, bytenr, num_bytes, 0);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>   btrfs_release_path(path);
>  
>  out:
>   btrfs_free_path(path);
>   return ret;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  /*
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 2/5] btrfs: Use common error handling code in __btrfs_free_extent()

2017-08-21 Thread Jeff Mahoney

   btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>   } else {
>   if (found_extent) {
> @@ -7078,37 +7067,33 @@ static int __btrfs_free_extent(struct 
> btrfs_trans_handle *trans,
>   last_ref = 1;
>   ret = btrfs_del_items(trans, extent_root, path, path->slots[0],
> num_to_del);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
> +
>   btrfs_release_path(path);
>  
>   if (is_data) {
>   ret = btrfs_del_csums(trans, info, bytenr, num_bytes);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>  
>   ret = add_to_free_space_tree(trans, info, bytenr, num_bytes);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>  
>   ret = update_block_group(trans, info, bytenr, num_bytes, 0);
> - if (ret) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>   btrfs_release_path(path);
>  
>  out:
>   btrfs_free_path(path);
>   return ret;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  /*
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 5/5] btrfs: Use common error handling code in btrfs_mark_extent_written()

2017-08-21 Thread Jeff Mahoney

bort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret)
> + goto abort_transaction;
>   }
>   if (del_nr == 0) {
>   fi = btrfs_item_ptr(leaf, path->slots[0],
> @@ -1314,14 +1297,17 @@ int btrfs_mark_extent_written(struct 
> btrfs_trans_handle *trans,
>   btrfs_mark_buffer_dirty(leaf);
>  
>   ret = btrfs_del_items(trans, root, path, del_slot, del_nr);
> -     if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   }
>  out:
>   btrfs_free_path(path);
>   return 0;
> +e_inval:
> + ret = -EINVAL;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  /*
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 5/5] btrfs: Use common error handling code in btrfs_mark_extent_written()

2017-08-21 Thread Jeff Mahoney

  }
> + if (ret)
> + goto abort_transaction;
>   }
>   if (del_nr == 0) {
>   fi = btrfs_item_ptr(leaf, path->slots[0],
> @@ -1314,14 +1297,17 @@ int btrfs_mark_extent_written(struct 
> btrfs_trans_handle *trans,
>   btrfs_mark_buffer_dirty(leaf);
>  
>   ret = btrfs_del_items(trans, root, path, del_slot, del_nr);
> - if (ret < 0) {
> - btrfs_abort_transaction(trans, ret);
> - goto out;
> - }
> + if (ret < 0)
> + goto abort_transaction;
>   }
>  out:
>   btrfs_free_path(path);
>   return 0;
> +e_inval:
> + ret = -EINVAL;
> +abort_transaction:
> + btrfs_abort_transaction(trans, ret);
> + goto out;
>  }
>  
>  /*
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-10 Thread Jeff Mahoney

On 8/6/17 9:59 AM, Thorsten Leemhuis wrote:
> Hi! Find below my second regression report for Linux 4.13. It lists 10
> regressions I'm currently aware of (albeit in one case it's not entirely
> clear yet if it's a regression in 4.13). One regression got fixed since
> last weeks report. You can also find the report at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
> 
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid And please tell me if
> there is anything in the report that shouldn't be there.
> 
> Ciao, Thorsten
> 
> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!
> 
> == Current regressions ==
> 
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> (2017-07-10)
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Status: Asked on the list, but issue still gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/e585513b76
> Note: I'm a bit unsure if adding this issue to this list was a good idea.
> 
> Null dereference in rt5677_i2c_probe() (2017-07-17)
> https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Linux-Regression-ID: lr#96bd63
> Status: Patch is available in in asoc-next as commit ddc9e69b9dc2, but
> was not part of the changes to this subsystem that got merged a few days ago
> Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
> Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)
> 
> [I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
> https://bugs.freedesktop.org/show_bug.cgi?id=101819
> Linux-Regression-ID: lr#d672f3
> Status: could not get reproduced yet
> Note: looks like it's getting ignored
> Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)
> 
> [Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
> keyboard (2017-07-24)
> https://bugzilla.kernel.org/show_bug.cgi?id=196459
> Linux-Regression-ID: lr#bd29ab
> Status: it's a tracking bug, looks like issue is handled by Intel devs
> already
> Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
> Note: suspend-to-idle is rare
> 
> [lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
> https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
> Linux-Regression-ID: lr#a7d273
> Status: Seems it gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/28785f70ef

We're not ignoring it.  It's that this part of allocation seems to be a
collection of bugs that approximate a correct result, and we're
addressing them individually.  This patch by itself is correct but
uncovered a couple of underlying issues.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: Linux 4.13: Reported regressions as of Sunday, 2017-08-06

2017-08-10 Thread Jeff Mahoney

On 8/6/17 9:59 AM, Thorsten Leemhuis wrote:
> Hi! Find below my second regression report for Linux 4.13. It lists 10
> regressions I'm currently aware of (albeit in one case it's not entirely
> clear yet if it's a regression in 4.13). One regression got fixed since
> last weeks report. You can also find the report at
> http://bit.ly/lnxregrep413 where I try to update it every now and then.
> 
> As always: Are you aware of any other regressions? Then please let me
> know. For details see http://bit.ly/lnxregtrackid And please tell me if
> there is anything in the report that shouldn't be there.
> 
> Ciao, Thorsten
> 
> P.S.: Thx to all those that CCed me on regression reports or provided
> other input, it makes compiling these reports a whole lot easier!
> 
> == Current regressions ==
> 
> [x86/mm/gup] e585513b76: will-it-scale.per_thread_ops -6.9% regression
> (2017-07-10)
> http://lkml.kernel.org/r/20170710024020.GA26389@yexl-desktop
> Status: Asked on the list, but issue still gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/e585513b76
> Note: I'm a bit unsure if adding this issue to this list was a good idea.
> 
> Null dereference in rt5677_i2c_probe() (2017-07-17)
> https://bugzilla.kernel.org/show_bug.cgi?id=196397
> Linux-Regression-ID: lr#96bd63
> Status: Patch is available in in asoc-next as commit ddc9e69b9dc2, but
> was not part of the changes to this subsystem that got merged a few days ago
> Cause: https://git.kernel.org/torvalds/c/a36afb0ab6
> Latest: https://bugzilla.kernel.org/show_bug.cgi?id=196397#c6 (2017-07-17)
> 
> [I945GM] Pasted text not shown after mouse middle-click (2017-07-17)
> https://bugs.freedesktop.org/show_bug.cgi?id=101819
> Linux-Regression-ID: lr#d672f3
> Status: could not get reproduced yet
> Note: looks like it's getting ignored
> Latest: https://bugs.freedesktop.org/show_bug.cgi?id=101819#c8 (2017-07-17)
> 
> [Dell xps13 9630] Could not be woken up from suspend-to-idle via usb
> keyboard (2017-07-24)
> https://bugzilla.kernel.org/show_bug.cgi?id=196459
> Linux-Regression-ID: lr#bd29ab
> Status: it's a tracking bug, looks like issue is handled by Intel devs
> already
> Cause: https://git.kernel.org/torvalds/c/33e4f80ee6
> Note: suspend-to-idle is rare
> 
> [lkp-robot] [Btrfs]  28785f70ef: xfstests.generic.273.fail (2017-07-26)
> https://lkml.kernel.org/r/20170726062352.GC4877@yexl-desktop
> Linux-Regression-ID: lr#a7d273
> Status: Seems it gets ignored by everyone
> Cause: https://git.kernel.org/torvalds/c/28785f70ef

We're not ignoring it.  It's that this part of allocation seems to be a
collection of bugs that approximate a correct result, and we're
addressing them individually.  This patch by itself is correct but
uncovered a couple of underlying issues.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 1/1] - Fix reiserfs WARNING in dquot_writeback_dquots

2017-06-22 Thread Jeff Mahoney

ting and unmounting works as expected
>> without issue.

I suspect this is because you aren't doing any of the things that would
conflict here.  Remounting, freeze/thaw, or really anything that takes
->s_umount as a writer running in a different thread would cause problems.

> --- a/fs/reiserfs/super.c 2017-05-29 00:14:45.0 -0400
> +++ b/fs/reiserfs/super.c 2017-05-29 00:51:44.0 -0400
> @@ -67,17 +67,28 @@
>  static int reiserfs_sync_fs(struct super_block *s, int wait)
>  {
>   struct reiserfs_transaction_handle th;
> + int got_lock;
>  
>   /*
>* Writeback quota in non-journalled quota case - journalled quota has
>* no dirty dquots
>*/
> +
> + if ( down_read_trylock(>s_umount) )
> + got_lock = 1;
> + else
> + got_lock = 0;
> +
> 

This is pretty much the same as not using the lock.  The lock is a requirement 
to
continue and assuming that if we can't get the lock that we must already be 
holding
it is incorrect.  There may be other threads executing with s->s_umount held as 
writers
and this patch means that this would execute concurrently, which is incorrect.

>   dquot_writeback_dquots(s, -1);
>   reiserfs_write_lock(s);
>   if (!journal_begin(, s, 1))
>   if (!journal_end_sync())
>   reiserfs_flush_old_commits(s);
>   reiserfs_write_unlock(s);
> +
> + if ( got_lock )
> + up_read(>s_umount);
> +
>   return 0;
>  }

Please follow the surrounding style.  Spaces within the conditional are
not part of the accepted style[1].

-Jeff

[1] 
https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 1/1] - Fix reiserfs WARNING in dquot_writeback_dquots

2017-06-22 Thread Jeff Mahoney

ause you aren't doing any of the things that would
conflict here.  Remounting, freeze/thaw, or really anything that takes
->s_umount as a writer running in a different thread would cause problems.

> --- a/fs/reiserfs/super.c 2017-05-29 00:14:45.0 -0400
> +++ b/fs/reiserfs/super.c 2017-05-29 00:51:44.0 -0400
> @@ -67,17 +67,28 @@
>  static int reiserfs_sync_fs(struct super_block *s, int wait)
>  {
>   struct reiserfs_transaction_handle th;
> + int got_lock;
>  
>   /*
>* Writeback quota in non-journalled quota case - journalled quota has
>* no dirty dquots
>*/
> +
> + if ( down_read_trylock(>s_umount) )
> + got_lock = 1;
> + else
> + got_lock = 0;
> +
> 

This is pretty much the same as not using the lock.  The lock is a requirement 
to
continue and assuming that if we can't get the lock that we must already be 
holding
it is incorrect.  There may be other threads executing with s->s_umount held as 
writers
and this patch means that this would execute concurrently, which is incorrect.

>   dquot_writeback_dquots(s, -1);
>   reiserfs_write_lock(s);
>       if (!journal_begin(, s, 1))
>   if (!journal_end_sync())
>   reiserfs_flush_old_commits(s);
>   reiserfs_write_unlock(s);
> +
> + if ( got_lock )
> + up_read(>s_umount);
> +
>   return 0;
>  }

Please follow the surrounding style.  Spaces within the conditional are
not part of the accepted style[1].

-Jeff

[1] 
https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: __link_block_group uses GFP_KERNEL

2017-03-24 Thread Jeff Mahoney

On 3/24/17 5:02 AM, Denis Kirjanov wrote:
> Hi guys,
> 
> Looks like that current code does GFP_KERNEL allocation inside
> __link_block_group.
> the function invokes kobject_add and internally creates sysfs files
> with the GFP_KERNEL flag set.

Yep, that's a bug.

> But since do_chunk_alloc executes insides the btrfs transaction it's
> not allowed to sleep.

It's allowed to sleep but isn't allowed to do reclaim that involves file
system writeback.  Michal Hocko's allocation context idea would fix
this, but it's not there yet, so we'll need to defer the kobject_add
until we can use GFP_KERNEL.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: __link_block_group uses GFP_KERNEL

2017-03-24 Thread Jeff Mahoney

On 3/24/17 5:02 AM, Denis Kirjanov wrote:
> Hi guys,
> 
> Looks like that current code does GFP_KERNEL allocation inside
> __link_block_group.
> the function invokes kobject_add and internally creates sysfs files
> with the GFP_KERNEL flag set.

Yep, that's a bug.

> But since do_chunk_alloc executes insides the btrfs transaction it's
> not allowed to sleep.

It's allowed to sleep but isn't allowed to do reclaim that involves file
system writeback.  Michal Hocko's allocation context idea would fix
this, but it's not there yet, so we'll need to defer the kobject_add
until we can use GFP_KERNEL.

-Jeff

-- 
Jeff Mahoney
SUSE Labs

signature.asc
Description: OpenPGP digital signature

Re: [PATCH] uapi: fix linux/btrfs.h userspace compilation error

2017-02-15 Thread Jeff Mahoney

On 2/15/17 3:02 PM, Dmitry V. Levin wrote:
> Stop using NULL to fix the following linux/btrfs.h userspace compilation
> error:
> 
> /usr/include/linux/btrfs.h: In function 'btrfs_err_str':
> /usr/include/linux/btrfs.h:740:11: error: 'NULL' undeclared (first use in 
> this function)
> return NULL;
> 
> Signed-off-by: Dmitry V. Levin <l...@altlinux.org>
> ---
>  include/uapi/linux/btrfs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index db4c253..01c612f 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -737,7 +737,7 @@ static inline char *btrfs_err_str(enum btrfs_err_code 
> err_code)
>   return "add/delete/balance/replace/resize operation "\
>   "in progress";
>   default:
> - return NULL;
> + return 0;
>   }
>  }
>  
> 

Hi Dmitry -

Just remove the whole routine.  It's not called from anywhere and is
replicated in the userspace headers for btrfsprogs.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] uapi: fix linux/btrfs.h userspace compilation error

2017-02-15 Thread Jeff Mahoney

On 2/15/17 3:02 PM, Dmitry V. Levin wrote:
> Stop using NULL to fix the following linux/btrfs.h userspace compilation
> error:
> 
> /usr/include/linux/btrfs.h: In function 'btrfs_err_str':
> /usr/include/linux/btrfs.h:740:11: error: 'NULL' undeclared (first use in 
> this function)
> return NULL;
> 
> Signed-off-by: Dmitry V. Levin 
> ---
>  include/uapi/linux/btrfs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
> index db4c253..01c612f 100644
> --- a/include/uapi/linux/btrfs.h
> +++ b/include/uapi/linux/btrfs.h
> @@ -737,7 +737,7 @@ static inline char *btrfs_err_str(enum btrfs_err_code 
> err_code)
>   return "add/delete/balance/replace/resize operation "\
>   "in progress";
>   default:
> - return NULL;
> + return 0;
>   }
>  }
>  
> 

Hi Dmitry -

Just remove the whole routine.  It's not called from anywhere and is
replicated in the userspace headers for btrfsprogs.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [Regression 4.7-rc1] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2017-02-06 Thread Jeff Mahoney

On 1/9/17 6:28 AM, David Sterba wrote:
> On Fri, Jan 06, 2017 at 12:22:34PM -0500, Joseph Salisbury wrote:
>> A kernel bug report was opened against Ubuntu [0].  This bug was fixed
>> by the following commit in v4.7-rc1:
>>
>> commit 4c63c2454eff996c5e27991221106eb511f7db38
>>
>> Author: Luke Dashjr <l...@dashjr.org>
>> Date:   Thu Oct 29 08:22:21 2015 +
>>
>> btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in
>> btrfs_ioctl
>>
>>
>> However, this commit introduced a new regression.  With this commit
>> applied, "btrfs fi show" no longer works and the btrfs snapshot
>> functionality breaks.
> 
> A plain 32bit kernel with 32bit userspace works fine. The bug seems to
> be on a 64bit kernel with 32bit userspace and the CONFIG_COMPAT compiled
> in. Strace does not show anything special:
> 
> stat64("subv1", 0xffc64fcc) = -1 ENOENT (No such file or 
> directory)
> statfs64(".", 84, {f_type=BTRFS_SUPER_MAGIC, f_bsize=4096, f_blocks=5110784, 
> f_bfree=4076143, f_bavail=4010951, f_files=0, f_ffree=0, f_fsid={val=[
> 4260464218, 2297804334]}, f_namelen=255, f_frsize=4096, 
> f_flags=ST_VALID|ST_RELATIME}) = 0
> stat64(".", {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> stat64(".", {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
> fstat64(3, {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
> write(1, "Create subvolume './subv1'\n", 27) = 27
> ioctl(3, BTRFS_IOC_SUBVOL_CREATE, {fd=0, name="subv1"}) = -1 ENOTTY 
> (Inappropriate ioctl for device)
> 
> The value of BTRFS_IOC_SUBVOL_CREATE is same on 32bit and 64bit kernels.
> As it returns ENOTTY, the value is not recognized. A candidate function
> is btrfs_compat_ioctl that checks for just the IOC32 numbers and returns
> -ENOIOCTLCMD otherwise.
> 
> The callchain in fs/compat_ioctl.c:ioctl cheks for the specific callback
> first, if it retunrs -ENOIOCTLCMD then goes to the normal ioctl
> callback, so there's always a point we reach the handler of
> BTRFS_IOC_SUBVOL_CREATE. So I don't see how it could happen.

I got a minute to look at this today.  It doesn't fallback to the normal
ioctl handler.  Since we have ->unlocked_ioctl defined, it checks the
hard-coded compat tables and then fails with -ENOTTY.

Fortunately, the fix is simple.  The default stanza of that switch
statement in btrfs_compat_ioctl needs to be removed.  Then it just
works.  I'll post a fix momentarily.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [Regression 4.7-rc1] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2017-02-06 Thread Jeff Mahoney

On 1/9/17 6:28 AM, David Sterba wrote:
> On Fri, Jan 06, 2017 at 12:22:34PM -0500, Joseph Salisbury wrote:
>> A kernel bug report was opened against Ubuntu [0].  This bug was fixed
>> by the following commit in v4.7-rc1:
>>
>> commit 4c63c2454eff996c5e27991221106eb511f7db38
>>
>> Author: Luke Dashjr 
>> Date:   Thu Oct 29 08:22:21 2015 +
>>
>> btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in
>> btrfs_ioctl
>>
>>
>> However, this commit introduced a new regression.  With this commit
>> applied, "btrfs fi show" no longer works and the btrfs snapshot
>> functionality breaks.
> 
> A plain 32bit kernel with 32bit userspace works fine. The bug seems to
> be on a 64bit kernel with 32bit userspace and the CONFIG_COMPAT compiled
> in. Strace does not show anything special:
> 
> stat64("subv1", 0xffc64fcc) = -1 ENOENT (No such file or 
> directory)
> statfs64(".", 84, {f_type=BTRFS_SUPER_MAGIC, f_bsize=4096, f_blocks=5110784, 
> f_bfree=4076143, f_bavail=4010951, f_files=0, f_ffree=0, f_fsid={val=[
> 4260464218, 2297804334]}, f_namelen=255, f_frsize=4096, 
> f_flags=ST_VALID|ST_RELATIME}) = 0
> stat64(".", {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> stat64(".", {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> open(".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
> fstat64(3, {st_mode=S_IFDIR|0700, st_size=228, ...}) = 0
> fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
> write(1, "Create subvolume './subv1'\n", 27) = 27
> ioctl(3, BTRFS_IOC_SUBVOL_CREATE, {fd=0, name="subv1"}) = -1 ENOTTY 
> (Inappropriate ioctl for device)
> 
> The value of BTRFS_IOC_SUBVOL_CREATE is same on 32bit and 64bit kernels.
> As it returns ENOTTY, the value is not recognized. A candidate function
> is btrfs_compat_ioctl that checks for just the IOC32 numbers and returns
> -ENOIOCTLCMD otherwise.
> 
> The callchain in fs/compat_ioctl.c:ioctl cheks for the specific callback
> first, if it retunrs -ENOIOCTLCMD then goes to the normal ioctl
> callback, so there's always a point we reach the handler of
> BTRFS_IOC_SUBVOL_CREATE. So I don't see how it could happen.

I got a minute to look at this today.  It doesn't fallback to the normal
ioctl handler.  Since we have ->unlocked_ioctl defined, it checks the
hard-coded compat tables and then fails with -ENOTTY.

Fortunately, the fix is simple.  The default stanza of that switch
statement in btrfs_compat_ioctl needs to be removed.  Then it just
works.  I'll post a fix momentarily.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

[PATCH] Docbook: re-remove 80211.xml from Makefile

2016-12-29 Thread Jeff Mahoney

Merge commit 726d661fea3 (Merge remote-tracking branch 
'sound/topic/restize-docs' into sound)
accidentally re-added 80211.xml to the DocBook Makefile, but
commit 819bf593767 (docs-rst: sphinxify 802.11 documentation) removed
80211.tmpl and the rule from the Makefile.  As a result, we get build
failures for docs:
make[2]: *** No rule to make target 'Documentation/DocBook/80211.xml', needed 
by 'Documentation/DocBook/80211.9'.  Stop.

This patch re-removes the 80211 rule.

Fixes: 726d661fea3 (Merge remote-tracking branch 'sound/topic/restize-docs' 
into sound)
Signed-off-by: Jeff Mahoney <je...@suse.com>
---
 Documentation/DocBook/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -12,7 +12,7 @@ DOCBOOKS := z8530book.xml  \
kernel-api.xml filesystems.xml lsm.xml kgdb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
-   80211.xml sh.xml regulator.xml w1.xml \
+   sh.xml regulator.xml w1.xml \
writing_musb_glue_layer.xml iio.xml
 
 ifeq ($(DOCBOOKS),)

-- 
Jeff Mahoney
SUSE Labs

[PATCH] Docbook: re-remove 80211.xml from Makefile

2016-12-29 Thread Jeff Mahoney

Merge commit 726d661fea3 (Merge remote-tracking branch 
'sound/topic/restize-docs' into sound)
accidentally re-added 80211.xml to the DocBook Makefile, but
commit 819bf593767 (docs-rst: sphinxify 802.11 documentation) removed
80211.tmpl and the rule from the Makefile.  As a result, we get build
failures for docs:
make[2]: *** No rule to make target 'Documentation/DocBook/80211.xml', needed 
by 'Documentation/DocBook/80211.9'.  Stop.

This patch re-removes the 80211 rule.

Fixes: 726d661fea3 (Merge remote-tracking branch 'sound/topic/restize-docs' 
into sound)
Signed-off-by: Jeff Mahoney 
---
 Documentation/DocBook/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -12,7 +12,7 @@ DOCBOOKS := z8530book.xml  \
kernel-api.xml filesystems.xml lsm.xml kgdb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
-   80211.xml sh.xml regulator.xml w1.xml \
+   sh.xml regulator.xml w1.xml \
writing_musb_glue_layer.xml iio.xml
 
 ifeq ($(DOCBOOKS),)

-- 
Jeff Mahoney
SUSE Labs

Re: [PATCH] btrfs: fix dereference on inode->i_sb before inode null check

2016-12-16 Thread Jeff Mahoney

On 12/16/16 7:20 AM, Colin King wrote:
> From: Colin Ian King <colin.k...@canonical.com>
> 
> inode is being deferenced and then inode is checked to see if it
> is null, implying we potentially could have a null pointer deference
> on inode.
> 
> Found with static analysis by CoverityScan, CID 1389472
> 
> Fix this by dereferencing inode only after the inode null check.
> 
> Fixes: 0b246afa62b0cf5 ("btrfs: root->fs_info cleanup, add fs_info 
> convenience variables")

Hi Colin -

Thanks for the review.  The right fix here is to eliminate the tests for
inode == NULL entirely.  This is a callback for exportfs, which will
itself crash if dentry->d_inode or parent->d_inode is NULL.  Removing
the tests would be consistent with other file systems.

-Jeff

> Signed-off-by: Colin Ian King <colin.k...@canonical.com>
> ---
>  fs/btrfs/export.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
> index 340d907..b746d2b 100644
> --- a/fs/btrfs/export.c
> +++ b/fs/btrfs/export.c
> @@ -223,7 +223,7 @@ static int btrfs_get_name(struct dentry *parent, char 
> *name,
>  {
>   struct inode *inode = d_inode(child);
>   struct inode *dir = d_inode(parent);
> - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> + struct btrfs_fs_info *fs_info;
>   struct btrfs_path *path;
>   struct btrfs_root *root = BTRFS_I(dir)->root;
>   struct btrfs_inode_ref *iref;
> @@ -241,6 +241,7 @@ static int btrfs_get_name(struct dentry *parent, char 
> *name,
>   if (!S_ISDIR(dir->i_mode))
>   return -EINVAL;
>  
> + fs_info = btrfs_sb(inode->i_sb);
>   ino = btrfs_ino(inode);
>  
>   path = btrfs_alloc_path();
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] btrfs: fix dereference on inode->i_sb before inode null check

2016-12-16 Thread Jeff Mahoney

On 12/16/16 7:20 AM, Colin King wrote:
> From: Colin Ian King 
> 
> inode is being deferenced and then inode is checked to see if it
> is null, implying we potentially could have a null pointer deference
> on inode.
> 
> Found with static analysis by CoverityScan, CID 1389472
> 
> Fix this by dereferencing inode only after the inode null check.
> 
> Fixes: 0b246afa62b0cf5 ("btrfs: root->fs_info cleanup, add fs_info 
> convenience variables")

Hi Colin -

Thanks for the review.  The right fix here is to eliminate the tests for
inode == NULL entirely.  This is a callback for exportfs, which will
itself crash if dentry->d_inode or parent->d_inode is NULL.  Removing
the tests would be consistent with other file systems.

-Jeff

> Signed-off-by: Colin Ian King 
> ---
>  fs/btrfs/export.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
> index 340d907..b746d2b 100644
> --- a/fs/btrfs/export.c
> +++ b/fs/btrfs/export.c
> @@ -223,7 +223,7 @@ static int btrfs_get_name(struct dentry *parent, char 
> *name,
>  {
>   struct inode *inode = d_inode(child);
>   struct inode *dir = d_inode(parent);
> - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> + struct btrfs_fs_info *fs_info;
>   struct btrfs_path *path;
>   struct btrfs_root *root = BTRFS_I(dir)->root;
>   struct btrfs_inode_ref *iref;
> @@ -241,6 +241,7 @@ static int btrfs_get_name(struct dentry *parent, char 
> *name,
>   if (!S_ISDIR(dir->i_mode))
>   return -EINVAL;
>  
> + fs_info = btrfs_sb(inode->i_sb);
>   ino = btrfs_ino(inode);
>  
>   path = btrfs_alloc_path();
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Revert "Btrfs: don't delay inode ref updates during log, replay"

2016-12-02 Thread Jeff Mahoney

Whoops, the [PATCH] line should've specified more clearly:  This only
applies to linux-stable, 3.12.y.

Sorry for any confusion.

-Jeff

On 12/2/16 10:21 PM, Jeff Mahoney wrote:
> This reverts commit 644d10716875b24388680925d6c7502420987bfe.
> 
> The original patch for mainline, 6f8960541b1 (Btrfs: don't delay
> inode ref updates during log replay) lists 1d52c78afbb (Btrfs: try
> not to ENOSPC on log replay) as the only pre-3.18 dependency, but it
> also depends on 67de11769bd (Btrfs: introduce the delayed inode ref
> deletion for the single link inode), which was introduced in 3.14
> and isn't in 3.12.y.
> 
> The -stable commit added the check to btrfs_delayed_update_inode,
> which may look similar to btrfs_delayed_delete_inode_ref, but it's
> only superficial.  The tops of both functions handle typical
> delayed node boilerplate.  The upshot is that the patch is harmless
> since the caller already checks to see if we're doing log recovery,
> so we're not breaking anything.  It should be reverted because it
> makes it appear as if this issue was fixed for users who did
> backport 67de11769bd, when it is not.
> 
> Signed-off-by: Jeff Mahoney <je...@suse.com>
> ---
>  fs/btrfs/delayed-inode.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index 34f33e1..269ac79 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -1805,14 +1805,6 @@ int btrfs_delayed_update_inode(struct 
> btrfs_trans_handle *trans,
>   struct btrfs_delayed_node *delayed_node;
>   int ret = 0;
>  
> - /*
> -  * we don't do delayed inode updates during log recovery because it
> -  * leads to enospc problems.  This means we also can't do
> -  * delayed inode refs
> -  */
> - if (BTRFS_I(inode)->root->fs_info->log_root_recovering)
> - return -EAGAIN;
> -
>   delayed_node = btrfs_get_or_create_delayed_node(inode);
>   if (IS_ERR(delayed_node))
>   return PTR_ERR(delayed_node);
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Revert "Btrfs: don't delay inode ref updates during log, replay"

2016-12-02 Thread Jeff Mahoney

Whoops, the [PATCH] line should've specified more clearly:  This only
applies to linux-stable, 3.12.y.

Sorry for any confusion.

-Jeff

On 12/2/16 10:21 PM, Jeff Mahoney wrote:
> This reverts commit 644d10716875b24388680925d6c7502420987bfe.
> 
> The original patch for mainline, 6f8960541b1 (Btrfs: don't delay
> inode ref updates during log replay) lists 1d52c78afbb (Btrfs: try
> not to ENOSPC on log replay) as the only pre-3.18 dependency, but it
> also depends on 67de11769bd (Btrfs: introduce the delayed inode ref
> deletion for the single link inode), which was introduced in 3.14
> and isn't in 3.12.y.
> 
> The -stable commit added the check to btrfs_delayed_update_inode,
> which may look similar to btrfs_delayed_delete_inode_ref, but it's
> only superficial.  The tops of both functions handle typical
> delayed node boilerplate.  The upshot is that the patch is harmless
> since the caller already checks to see if we're doing log recovery,
> so we're not breaking anything.  It should be reverted because it
> makes it appear as if this issue was fixed for users who did
> backport 67de11769bd, when it is not.
> 
> Signed-off-by: Jeff Mahoney 
> ---
>  fs/btrfs/delayed-inode.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index 34f33e1..269ac79 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -1805,14 +1805,6 @@ int btrfs_delayed_update_inode(struct 
> btrfs_trans_handle *trans,
>   struct btrfs_delayed_node *delayed_node;
>   int ret = 0;
>  
> - /*
> -  * we don't do delayed inode updates during log recovery because it
> -  * leads to enospc problems.  This means we also can't do
> -  * delayed inode refs
> -  */
> - if (BTRFS_I(inode)->root->fs_info->log_root_recovering)
> - return -EAGAIN;
> -
>   delayed_node = btrfs_get_or_create_delayed_node(inode);
>   if (IS_ERR(delayed_node))
>   return PTR_ERR(delayed_node);
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

[PATCH] Revert "Btrfs: don't delay inode ref updates during log, replay"

2016-12-02 Thread Jeff Mahoney

This reverts commit 644d10716875b24388680925d6c7502420987bfe.

The original patch for mainline, 6f8960541b1 (Btrfs: don't delay
inode ref updates during log replay) lists 1d52c78afbb (Btrfs: try
not to ENOSPC on log replay) as the only pre-3.18 dependency, but it
also depends on 67de11769bd (Btrfs: introduce the delayed inode ref
deletion for the single link inode), which was introduced in 3.14
and isn't in 3.12.y.

The -stable commit added the check to btrfs_delayed_update_inode,
which may look similar to btrfs_delayed_delete_inode_ref, but it's
only superficial.  The tops of both functions handle typical
delayed node boilerplate.  The upshot is that the patch is harmless
since the caller already checks to see if we're doing log recovery,
so we're not breaking anything.  It should be reverted because it
makes it appear as if this issue was fixed for users who did
backport 67de11769bd, when it is not.

Signed-off-by: Jeff Mahoney <je...@suse.com>
---
 fs/btrfs/delayed-inode.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 34f33e1..269ac79 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1805,14 +1805,6 @@ int btrfs_delayed_update_inode(struct btrfs_trans_handle 
*trans,
struct btrfs_delayed_node *delayed_node;
int ret = 0;
 
-   /*
-* we don't do delayed inode updates during log recovery because it
-* leads to enospc problems.  This means we also can't do
-* delayed inode refs
-*/
-   if (BTRFS_I(inode)->root->fs_info->log_root_recovering)
-   return -EAGAIN;
-
delayed_node = btrfs_get_or_create_delayed_node(inode);
if (IS_ERR(delayed_node))
return PTR_ERR(delayed_node);
-- 
2.7.1


-- 
Jeff Mahoney
SUSE Labs

[PATCH] Revert "Btrfs: don't delay inode ref updates during log, replay"

2016-12-02 Thread Jeff Mahoney

This reverts commit 644d10716875b24388680925d6c7502420987bfe.

The original patch for mainline, 6f8960541b1 (Btrfs: don't delay
inode ref updates during log replay) lists 1d52c78afbb (Btrfs: try
not to ENOSPC on log replay) as the only pre-3.18 dependency, but it
also depends on 67de11769bd (Btrfs: introduce the delayed inode ref
deletion for the single link inode), which was introduced in 3.14
and isn't in 3.12.y.

The -stable commit added the check to btrfs_delayed_update_inode,
which may look similar to btrfs_delayed_delete_inode_ref, but it's
only superficial.  The tops of both functions handle typical
delayed node boilerplate.  The upshot is that the patch is harmless
since the caller already checks to see if we're doing log recovery,
so we're not breaking anything.  It should be reverted because it
makes it appear as if this issue was fixed for users who did
backport 67de11769bd, when it is not.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/delayed-inode.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 34f33e1..269ac79 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1805,14 +1805,6 @@ int btrfs_delayed_update_inode(struct btrfs_trans_handle 
*trans,
struct btrfs_delayed_node *delayed_node;
int ret = 0;
 
-   /*
-* we don't do delayed inode updates during log recovery because it
-* leads to enospc problems.  This means we also can't do
-* delayed inode refs
-*/
-   if (BTRFS_I(inode)->root->fs_info->log_root_recovering)
-   return -EAGAIN;
-
delayed_node = btrfs_get_or_create_delayed_node(inode);
if (IS_ERR(delayed_node))
return PTR_ERR(delayed_node);
-- 
2.7.1


-- 
Jeff Mahoney
SUSE Labs

Re: Fs: Btrfs - Fix possible ERR_PTR() dereferencing.

2016-09-20 Thread Jeff Mahoney

On 9/20/16 2:48 AM, Shailendra Verma wrote:
> This is of course wrong to call kfree() if memdup_user() fails,
> no memory was allocated and the error in the error-valued pointer
> should be returned.
> 
> Reviewed-by: Ravikant Sharma <ravikant...@samsung.com>
> Signed-off-by: Shailendra Verma <shailendr...@samsung.com>

Hi Shailendra -

In all three cases, the return value is set using the error-valued
pointer and the pointer is set to NULL.  kfree() checks to see if the
pointer is NULL and, if so, does nothing.  This allows us to use a
common exit path which is an extremely common pattern in the kernel.  So
there's never any possible ERR_PTR dereferencing happening.

However, in all three cases, the allocation you're checking is the first
in each routine and there's no additional cleanup to do.  So your patch
is an improvement, but it's an improvement in code readability instead
of a bug fix.  I'd ask that you re-submit with a commit message that
reflects that.

Thanks,

-Jeff

> ---
>  fs/btrfs/ioctl.c | 21 ++---
>  1 file changed, 6 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index da94138..58a40f8 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4512,11 +4512,8 @@ static long btrfs_ioctl_logical_to_ino(struct 
> btrfs_root *root,
>   return -EPERM;
>  
>   loi = memdup_user(arg, sizeof(*loi));
> - if (IS_ERR(loi)) {
> - ret = PTR_ERR(loi);
> - loi = NULL;
> - goto out;
> - }
> + if (IS_ERR(loi))
> + return PTR_ERR(loi);
>  
>   path = btrfs_alloc_path();
>   if (!path) {


> @@ -5143,11 +5140,8 @@ static long btrfs_ioctl_set_received_subvol_32(struct 
> file *file,
>   int ret = 0;
>  
>   args32 = memdup_user(arg, sizeof(*args32));
> - if (IS_ERR(args32)) {
> - ret = PTR_ERR(args32);
> - args32 = NULL;
> - goto out;
> - }
> + if (IS_ERR(args32))
> + return PTR_ERR(args32);
>  
>   args64 = kmalloc(sizeof(*args64), GFP_NOFS);
>   if (!args64) {
> @@ -5195,11 +5189,8 @@ static long btrfs_ioctl_set_received_subvol(struct 
> file *file,
>   int ret = 0;
>  
>   sa = memdup_user(arg, sizeof(*sa));
> - if (IS_ERR(sa)) {
> - ret = PTR_ERR(sa);
> - sa = NULL;
> -     goto out;
> - }
> + if (IS_ERR(sa))
> + return PTR_ERR(sa);
>  
>   ret = _btrfs_ioctl_set_received_subvol(file, sa);
>  
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: Fs: Btrfs - Fix possible ERR_PTR() dereferencing.

2016-09-20 Thread Jeff Mahoney

On 9/20/16 2:48 AM, Shailendra Verma wrote:
> This is of course wrong to call kfree() if memdup_user() fails,
> no memory was allocated and the error in the error-valued pointer
> should be returned.
> 
> Reviewed-by: Ravikant Sharma 
> Signed-off-by: Shailendra Verma 

Hi Shailendra -

In all three cases, the return value is set using the error-valued
pointer and the pointer is set to NULL.  kfree() checks to see if the
pointer is NULL and, if so, does nothing.  This allows us to use a
common exit path which is an extremely common pattern in the kernel.  So
there's never any possible ERR_PTR dereferencing happening.

However, in all three cases, the allocation you're checking is the first
in each routine and there's no additional cleanup to do.  So your patch
is an improvement, but it's an improvement in code readability instead
of a bug fix.  I'd ask that you re-submit with a commit message that
reflects that.

Thanks,

-Jeff

> ---
>  fs/btrfs/ioctl.c | 21 ++---
>  1 file changed, 6 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index da94138..58a40f8 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4512,11 +4512,8 @@ static long btrfs_ioctl_logical_to_ino(struct 
> btrfs_root *root,
>   return -EPERM;
>  
>   loi = memdup_user(arg, sizeof(*loi));
> - if (IS_ERR(loi)) {
> - ret = PTR_ERR(loi);
> - loi = NULL;
> - goto out;
> - }
> + if (IS_ERR(loi))
> + return PTR_ERR(loi);
>  
>   path = btrfs_alloc_path();
>   if (!path) {


> @@ -5143,11 +5140,8 @@ static long btrfs_ioctl_set_received_subvol_32(struct 
> file *file,
>   int ret = 0;
>  
>   args32 = memdup_user(arg, sizeof(*args32));
> - if (IS_ERR(args32)) {
> - ret = PTR_ERR(args32);
> - args32 = NULL;
> - goto out;
> - }
> + if (IS_ERR(args32))
> + return PTR_ERR(args32);
>  
>   args64 = kmalloc(sizeof(*args64), GFP_NOFS);
>   if (!args64) {
> @@ -5195,11 +5189,8 @@ static long btrfs_ioctl_set_received_subvol(struct 
> file *file,
>   int ret = 0;
>  
>   sa = memdup_user(arg, sizeof(*sa));
> - if (IS_ERR(sa)) {
> - ret = PTR_ERR(sa);
> - sa = NULL;
> - goto out;
> - }
> + if (IS_ERR(sa))
> + return PTR_ERR(sa);
>  
>   ret = _btrfs_ioctl_set_received_subvol(file, sa);
>  
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-10 Thread Jeff Mahoney

On 9/10/16 11:58 AM, Sean Fu wrote:
> On Thu, Sep 08, 2016 at 11:25:48PM -0400, Jeff Mahoney wrote:
>> On 9/8/16 11:08 PM, Sean Fu wrote:
>>> On Tue, Sep 06, 2016 at 11:12:20AM -0400, Jeff Mahoney wrote:
>>>> On 9/6/16 5:58 AM, David Sterba wrote:
>>>>> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>>>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>>>>>> directly?
>>>>>>>
>>>>>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>>>>>> to go back and check what else is missing.
>>>>>>
>>>>>> Actually, most of this didn't land.  Pretty much anything that's a root
>>>>>> ->fs_info conversion is in there.
>>>>>
>>>>> Only half of the patchset has been merged so far because it did not pass
>>>>> testing, so I bisected to some point. I was about to let you know once
>>>>> most of 4.9 patches are prepared so there are less merge conflicts.
>>>>
>>>> Ok, thanks.  I was going to start the rebase today but I'll hold off
>>>> until you're set for 4.9.
>>>>
>>> Hi Jeff, Could you please share your patch? Where can i get it?
>>> I wanna have a look at it.
>>
>> Sure, it's the whole series that starts with this commit:
>> commit 160ceedfd40085cfb1e08305917fcc24cefdad93
>> Author: Jeff Mahoney <je...@suse.com>
>> Date:   Wed Aug 31 23:55:33 2016 -0400
>>
>> btrfs: add dynamic debug support
>>
>> ... I still need to do clean up some commits that need merging.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/jeffm/linux-btrfs.git/log/?h=btrfs-testing/kdave/misc-4.9/root-fsinfo-cleanup
>>
> Nice work. Thanks for your explaination.
> I have one more question about it.
> Although the total text size of this function is not changed, using
> fs_info should need one more instruction to get chunk_root field of
> fs_info than using chunk_root directly.
> Does it cause any performance impact?

I doubt it.  The caller had to do the dereference before.

-Jeff

> Sean
>> -Jeff
>>
>>
>>> Thanks
>>>> -Jeff
>>>>
>>>> -- 
>>>> Jeff Mahoney
>>>> SUSE Labs
>>>>
>>>
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> -- 
>> Jeff Mahoney
>> SUSE Labs
>>
> 
> 
> 
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-10 Thread Jeff Mahoney

On 9/10/16 11:58 AM, Sean Fu wrote:
> On Thu, Sep 08, 2016 at 11:25:48PM -0400, Jeff Mahoney wrote:
>> On 9/8/16 11:08 PM, Sean Fu wrote:
>>> On Tue, Sep 06, 2016 at 11:12:20AM -0400, Jeff Mahoney wrote:
>>>> On 9/6/16 5:58 AM, David Sterba wrote:
>>>>> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>>>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>>>>>> directly?
>>>>>>>
>>>>>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>>>>>> to go back and check what else is missing.
>>>>>>
>>>>>> Actually, most of this didn't land.  Pretty much anything that's a root
>>>>>> ->fs_info conversion is in there.
>>>>>
>>>>> Only half of the patchset has been merged so far because it did not pass
>>>>> testing, so I bisected to some point. I was about to let you know once
>>>>> most of 4.9 patches are prepared so there are less merge conflicts.
>>>>
>>>> Ok, thanks.  I was going to start the rebase today but I'll hold off
>>>> until you're set for 4.9.
>>>>
>>> Hi Jeff, Could you please share your patch? Where can i get it?
>>> I wanna have a look at it.
>>
>> Sure, it's the whole series that starts with this commit:
>> commit 160ceedfd40085cfb1e08305917fcc24cefdad93
>> Author: Jeff Mahoney 
>> Date:   Wed Aug 31 23:55:33 2016 -0400
>>
>> btrfs: add dynamic debug support
>>
>> ... I still need to do clean up some commits that need merging.
>>
>> https://git.kernel.org/cgit/linux/kernel/git/jeffm/linux-btrfs.git/log/?h=btrfs-testing/kdave/misc-4.9/root-fsinfo-cleanup
>>
> Nice work. Thanks for your explaination.
> I have one more question about it.
> Although the total text size of this function is not changed, using
> fs_info should need one more instruction to get chunk_root field of
> fs_info than using chunk_root directly.
> Does it cause any performance impact?

I doubt it.  The caller had to do the dereference before.

-Jeff

> Sean
>> -Jeff
>>
>>
>>> Thanks
>>>> -Jeff
>>>>
>>>> -- 
>>>> Jeff Mahoney
>>>> SUSE Labs
>>>>
>>>
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> -- 
>> Jeff Mahoney
>> SUSE Labs
>>
> 
> 
> 
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-08 Thread Jeff Mahoney

On 9/8/16 11:08 PM, Sean Fu wrote:
> On Tue, Sep 06, 2016 at 11:12:20AM -0400, Jeff Mahoney wrote:
>> On 9/6/16 5:58 AM, David Sterba wrote:
>>> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>>>> directly?
>>>>>
>>>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>>>> to go back and check what else is missing.
>>>>
>>>> Actually, most of this didn't land.  Pretty much anything that's a root
>>>> ->fs_info conversion is in there.
>>>
>>> Only half of the patchset has been merged so far because it did not pass
>>> testing, so I bisected to some point. I was about to let you know once
>>> most of 4.9 patches are prepared so there are less merge conflicts.
>>
>> Ok, thanks.  I was going to start the rebase today but I'll hold off
>> until you're set for 4.9.
>>
> Hi Jeff, Could you please share your patch? Where can i get it?
> I wanna have a look at it.

Sure, it's the whole series that starts with this commit:
commit 160ceedfd40085cfb1e08305917fcc24cefdad93
Author: Jeff Mahoney <je...@suse.com>
Date:   Wed Aug 31 23:55:33 2016 -0400

btrfs: add dynamic debug support

... I still need to do clean up some commits that need merging.

https://git.kernel.org/cgit/linux/kernel/git/jeffm/linux-btrfs.git/log/?h=btrfs-testing/kdave/misc-4.9/root-fsinfo-cleanup

-Jeff


> Thanks
>> -Jeff
>>
>> -- 
>> Jeff Mahoney
>> SUSE Labs
>>
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-08 Thread Jeff Mahoney

On 9/8/16 11:08 PM, Sean Fu wrote:
> On Tue, Sep 06, 2016 at 11:12:20AM -0400, Jeff Mahoney wrote:
>> On 9/6/16 5:58 AM, David Sterba wrote:
>>> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>>>> directly?
>>>>>
>>>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>>>> to go back and check what else is missing.
>>>>
>>>> Actually, most of this didn't land.  Pretty much anything that's a root
>>>> ->fs_info conversion is in there.
>>>
>>> Only half of the patchset has been merged so far because it did not pass
>>> testing, so I bisected to some point. I was about to let you know once
>>> most of 4.9 patches are prepared so there are less merge conflicts.
>>
>> Ok, thanks.  I was going to start the rebase today but I'll hold off
>> until you're set for 4.9.
>>
> Hi Jeff, Could you please share your patch? Where can i get it?
> I wanna have a look at it.

Sure, it's the whole series that starts with this commit:
commit 160ceedfd40085cfb1e08305917fcc24cefdad93
Author: Jeff Mahoney 
Date:   Wed Aug 31 23:55:33 2016 -0400

btrfs: add dynamic debug support

... I still need to do clean up some commits that need merging.

https://git.kernel.org/cgit/linux/kernel/git/jeffm/linux-btrfs.git/log/?h=btrfs-testing/kdave/misc-4.9/root-fsinfo-cleanup

-Jeff


> Thanks
>> -Jeff
>>
>> -- 
>> Jeff Mahoney
>> SUSE Labs
>>
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Jeff Mahoney

On 9/6/16 5:58 AM, David Sterba wrote:
> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>> directly?
>>>
>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>> to go back and check what else is missing.
>>
>> Actually, most of this didn't land.  Pretty much anything that's a root
>> ->fs_info conversion is in there.
> 
> Only half of the patchset has been merged so far because it did not pass
> testing, so I bisected to some point. I was about to let you know once
> most of 4.9 patches are prepared so there are less merge conflicts.

Ok, thanks.  I was going to start the rebase today but I'll hold off
until you're set for 4.9.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Jeff Mahoney

On 9/6/16 5:58 AM, David Sterba wrote:
> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>>>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>>>> directly?
>>>
>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>> to go back and check what else is missing.
>>
>> Actually, most of this didn't land.  Pretty much anything that's a root
>> ->fs_info conversion is in there.
> 
> Only half of the patchset has been merged so far because it did not pass
> testing, so I bisected to some point. I was about to let you know once
> most of 4.9 patches are prepared so there are less merge conflicts.

Ok, thanks.  I was going to start the rebase today but I'll hold off
until you're set for 4.9.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-05 Thread Jeff Mahoney

On 9/5/16 11:05 PM, Jeff Mahoney wrote:
> On 9/5/16 3:56 AM, Qu Wenruo wrote:
>>
>>
>> At 09/05/2016 09:19 AM, Zhao Lei wrote:
>>> Hi, Sean Fu
>>>
>>>> From: Sean Fu [mailto:fxinr...@gmail.com]
>>>> Sent: Sunday, September 04, 2016 7:54 PM
>>>> To: dste...@suse.com
>>>> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
>>>> zhao...@cn.fujitsu.com; linux-bt...@vger.kernel.org;
>>>> linux-kernel@vger.kernel.org; Sean Fu <fxinr...@gmail.com>
>>>> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root
>>>> assignment in
>>>> btrfs_read_chunk_tree.
>>>>
>>>> The input argument root is already set with "fs_info->chunk_root".
>>>> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
>>>> "open_ctree".
>>>> “root->fs_info = fs_info” in "btrfs_alloc_root".
>>>>
>>> The root argument of this function means "any root".
>>> And the function is designed getting chunk root from
>>> "any root" in head.
>>>
>>> Since there is only one caller of this function,
>>> and the caller always send chunk_root as root argument in
>>> current code, we can remove above conversion,
>>> and I suggest renaming root to chunk_root to make it clear,
>>> something like:
>>>
>>> - btrfs_read_chunk_tree(struct btrfs_root *root)
>>> + btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
>>
>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>> directly?
> 
> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
> to go back and check what else is missing.

Actually, most of this didn't land.  Pretty much anything that's a root
->fs_info conversion is in there.

-Jeff

> -Jeff
> 
> 
>> Thanks,
>> Qu
>>
>>>
>>> Thanks
>>> Zhaolei
>>>
>>>> Signed-off-by: Sean Fu <fxinr...@gmail.com>
>>>> ---
>>>>  fs/btrfs/volumes.c | 2 --
>>>>  1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>> index 366b335..384a6d2 100644
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>>>>  int ret;
>>>>      int slot;
>>>>
>>>> -root = root->fs_info->chunk_root;
>>>> -
>>>>  path = btrfs_alloc_path();
>>>>  if (!path)
>>>>  return -ENOMEM;
>>>> -- 
>>>> 2.6.2
>>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-05 Thread Jeff Mahoney

On 9/5/16 11:05 PM, Jeff Mahoney wrote:
> On 9/5/16 3:56 AM, Qu Wenruo wrote:
>>
>>
>> At 09/05/2016 09:19 AM, Zhao Lei wrote:
>>> Hi, Sean Fu
>>>
>>>> From: Sean Fu [mailto:fxinr...@gmail.com]
>>>> Sent: Sunday, September 04, 2016 7:54 PM
>>>> To: dste...@suse.com
>>>> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
>>>> zhao...@cn.fujitsu.com; linux-bt...@vger.kernel.org;
>>>> linux-kernel@vger.kernel.org; Sean Fu 
>>>> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root
>>>> assignment in
>>>> btrfs_read_chunk_tree.
>>>>
>>>> The input argument root is already set with "fs_info->chunk_root".
>>>> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
>>>> "open_ctree".
>>>> “root->fs_info = fs_info” in "btrfs_alloc_root".
>>>>
>>> The root argument of this function means "any root".
>>> And the function is designed getting chunk root from
>>> "any root" in head.
>>>
>>> Since there is only one caller of this function,
>>> and the caller always send chunk_root as root argument in
>>> current code, we can remove above conversion,
>>> and I suggest renaming root to chunk_root to make it clear,
>>> something like:
>>>
>>> - btrfs_read_chunk_tree(struct btrfs_root *root)
>>> + btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
>>
>> Since root is only used to get fs_info->chunk_root, why not use fs_info
>> directly?
> 
> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
> to go back and check what else is missing.

Actually, most of this didn't land.  Pretty much anything that's a root
->fs_info conversion is in there.

-Jeff

> -Jeff
> 
> 
>> Thanks,
>> Qu
>>
>>>
>>> Thanks
>>> Zhaolei
>>>
>>>> Signed-off-by: Sean Fu 
>>>> ---
>>>>  fs/btrfs/volumes.c | 2 --
>>>>  1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>> index 366b335..384a6d2 100644
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>>>>  int ret;
>>>>  int slot;
>>>>
>>>> -root = root->fs_info->chunk_root;
>>>> -
>>>>  path = btrfs_alloc_path();
>>>>  if (!path)
>>>>  return -ENOMEM;
>>>> -- 
>>>> 2.6.2
>>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-05 Thread Jeff Mahoney

On 9/5/16 3:56 AM, Qu Wenruo wrote:
> 
> 
> At 09/05/2016 09:19 AM, Zhao Lei wrote:
>> Hi, Sean Fu
>>
>>> From: Sean Fu [mailto:fxinr...@gmail.com]
>>> Sent: Sunday, September 04, 2016 7:54 PM
>>> To: dste...@suse.com
>>> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
>>> zhao...@cn.fujitsu.com; linux-bt...@vger.kernel.org;
>>> linux-kernel@vger.kernel.org; Sean Fu <fxinr...@gmail.com>
>>> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root
>>> assignment in
>>> btrfs_read_chunk_tree.
>>>
>>> The input argument root is already set with "fs_info->chunk_root".
>>> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
>>> "open_ctree".
>>> “root->fs_info = fs_info” in "btrfs_alloc_root".
>>>
>> The root argument of this function means "any root".
>> And the function is designed getting chunk root from
>> "any root" in head.
>>
>> Since there is only one caller of this function,
>> and the caller always send chunk_root as root argument in
>> current code, we can remove above conversion,
>> and I suggest renaming root to chunk_root to make it clear,
>> something like:
>>
>> - btrfs_read_chunk_tree(struct btrfs_root *root)
>> + btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
> 
> Since root is only used to get fs_info->chunk_root, why not use fs_info
> directly?

Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
to go back and check what else is missing.

-Jeff


> Thanks,
> Qu
> 
>>
>> Thanks
>> Zhaolei
>>
>>> Signed-off-by: Sean Fu <fxinr...@gmail.com>
>>> ---
>>>  fs/btrfs/volumes.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 366b335..384a6d2 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>>>  int ret;
>>>  int slot;
>>>
>>> -root = root->fs_info->chunk_root;
>>> -
>>>  path = btrfs_alloc_path();
>>>  if (!path)
>>>  return -ENOMEM;
>>> -- 
>>> 2.6.2
>>>
>>
>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-05 Thread Jeff Mahoney

On 9/5/16 3:56 AM, Qu Wenruo wrote:
> 
> 
> At 09/05/2016 09:19 AM, Zhao Lei wrote:
>> Hi, Sean Fu
>>
>>> From: Sean Fu [mailto:fxinr...@gmail.com]
>>> Sent: Sunday, September 04, 2016 7:54 PM
>>> To: dste...@suse.com
>>> Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
>>> zhao...@cn.fujitsu.com; linux-bt...@vger.kernel.org;
>>> linux-kernel@vger.kernel.org; Sean Fu 
>>> Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root
>>> assignment in
>>> btrfs_read_chunk_tree.
>>>
>>> The input argument root is already set with "fs_info->chunk_root".
>>> "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
>>> "open_ctree".
>>> “root->fs_info = fs_info” in "btrfs_alloc_root".
>>>
>> The root argument of this function means "any root".
>> And the function is designed getting chunk root from
>> "any root" in head.
>>
>> Since there is only one caller of this function,
>> and the caller always send chunk_root as root argument in
>> current code, we can remove above conversion,
>> and I suggest renaming root to chunk_root to make it clear,
>> something like:
>>
>> - btrfs_read_chunk_tree(struct btrfs_root *root)
>> + btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
> 
> Since root is only used to get fs_info->chunk_root, why not use fs_info
> directly?

Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
to go back and check what else is missing.

-Jeff


> Thanks,
> Qu
> 
>>
>> Thanks
>> Zhaolei
>>
>>> Signed-off-by: Sean Fu 
>>> ---
>>>  fs/btrfs/volumes.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 366b335..384a6d2 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
>>>  int ret;
>>>  int slot;
>>>
>>> -root = root->fs_info->chunk_root;
>>> -
>>>  path = btrfs_alloc_path();
>>>  if (!path)
>>>  return -ENOMEM;
>>> -- 
>>> 2.6.2
>>>
>>
>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 13/17] staging: lustre: lloop: Fix build failure on ppc64

2016-04-10 Thread Jeff Mahoney

On 4/10/16 10:04 AM, James Simmons wrote:
> 
>> This patch was shown not to work. I just haven't removed it from opensuse 
>> yet.
> 
> Its been running in our production tree as well for some time. Guess that 
> change is a noop. In any case we have been discussing redoing the lloop 
> driver anyways. Just need to find the cycles.

I guess my memory was flakey and I was recalling the first comments in
LU-4000.  The updated version should be ok.

-Jeff

>> --
>> Jeff Mahoney
>> (apologies for the top post -- from my mobile)
>>
>>> On Apr 10, 2016, at 9:13 AM, James Simmons <jsimm...@infradead.org> wrote:
>>>
>>> From: Jeff Mahoney <je...@suse.com>
>>>
>>> On ppc64 with 64k pages, we get a build failure in lloop:
>>>
>>> drivers/staging/lustre/lustre/llite/lloop.c:527:2:
>>> note: in expansion of macro 'CLASSERT'
>>> CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
>>>
>>> There's no need to change the queue's logical block size. Even if it could
>>> accept a 64k value, that would result in any file system on top of it
>>> needing to also use 64k blocks. It'd be safe to set it to 4k, but there's
>>> no actual need for it. It's not used to split requests except for 
>>> WRITE_SAME,
>>> which lloop doesn't implement anyway.
>>>
>>> Signed-off-by: Jeff Mahoney <je...@suse.com>
>>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4000
>>> Reviewed-on: http://review.whamcloud.com/7745
>>> Reviewed-by: Jinshan Xiong <jinshan.xi...@intel.com>
>>> Reviewed-by: Minh Diep <minh.d...@intel.com>
>>> Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>
>>> Signed-off-by: James Simmons <jsimm...@infradead.org>
>>> ---
>>> drivers/staging/lustre/lustre/llite/lloop.c |3 ---
>>> 1 files changed, 0 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/staging/lustre/lustre/llite/lloop.c 
>>> b/drivers/staging/lustre/lustre/llite/lloop.c
>>> index b725fc1..f396753 100644
>>> --- a/drivers/staging/lustre/lustre/llite/lloop.c
>>> +++ b/drivers/staging/lustre/lustre/llite/lloop.c
>>> @@ -525,9 +525,6 @@ static int loop_set_fd(struct lloop_device *lo, struct 
>>> file *unused,
>>>lo->lo_queue->queuedata = lo;
>>>
>>>/* queue parameters */
>>> -CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
>>> -blk_queue_logical_block_size(lo->lo_queue,
>>> - (unsigned short)PAGE_CACHE_SIZE);
>>>blk_queue_max_hw_sectors(lo->lo_queue,
>>> LLOOP_MAX_SEGMENTS << (PAGE_CACHE_SHIFT - 9));
>>>blk_queue_max_segments(lo->lo_queue, LLOOP_MAX_SEGMENTS);
>>> -- 
>>> 1.7.1
>>>
>>>
>>
>>
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 13/17] staging: lustre: lloop: Fix build failure on ppc64

2016-04-10 Thread Jeff Mahoney

On 4/10/16 10:04 AM, James Simmons wrote:
> 
>> This patch was shown not to work. I just haven't removed it from opensuse 
>> yet.
> 
> Its been running in our production tree as well for some time. Guess that 
> change is a noop. In any case we have been discussing redoing the lloop 
> driver anyways. Just need to find the cycles.

I guess my memory was flakey and I was recalling the first comments in
LU-4000.  The updated version should be ok.

-Jeff

>> --
>> Jeff Mahoney
>> (apologies for the top post -- from my mobile)
>>
>>> On Apr 10, 2016, at 9:13 AM, James Simmons  wrote:
>>>
>>> From: Jeff Mahoney 
>>>
>>> On ppc64 with 64k pages, we get a build failure in lloop:
>>>
>>> drivers/staging/lustre/lustre/llite/lloop.c:527:2:
>>> note: in expansion of macro 'CLASSERT'
>>> CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
>>>
>>> There's no need to change the queue's logical block size. Even if it could
>>> accept a 64k value, that would result in any file system on top of it
>>> needing to also use 64k blocks. It'd be safe to set it to 4k, but there's
>>> no actual need for it. It's not used to split requests except for 
>>> WRITE_SAME,
>>> which lloop doesn't implement anyway.
>>>
>>> Signed-off-by: Jeff Mahoney 
>>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4000
>>> Reviewed-on: http://review.whamcloud.com/7745
>>> Reviewed-by: Jinshan Xiong 
>>> Reviewed-by: Minh Diep 
>>> Reviewed-by: Oleg Drokin 
>>> Signed-off-by: James Simmons 
>>> ---
>>> drivers/staging/lustre/lustre/llite/lloop.c |3 ---
>>> 1 files changed, 0 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/staging/lustre/lustre/llite/lloop.c 
>>> b/drivers/staging/lustre/lustre/llite/lloop.c
>>> index b725fc1..f396753 100644
>>> --- a/drivers/staging/lustre/lustre/llite/lloop.c
>>> +++ b/drivers/staging/lustre/lustre/llite/lloop.c
>>> @@ -525,9 +525,6 @@ static int loop_set_fd(struct lloop_device *lo, struct 
>>> file *unused,
>>>lo->lo_queue->queuedata = lo;
>>>
>>>/* queue parameters */
>>> -CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
>>> -blk_queue_logical_block_size(lo->lo_queue,
>>> - (unsigned short)PAGE_CACHE_SIZE);
>>>blk_queue_max_hw_sectors(lo->lo_queue,
>>> LLOOP_MAX_SEGMENTS << (PAGE_CACHE_SHIFT - 9));
>>>blk_queue_max_segments(lo->lo_queue, LLOOP_MAX_SEGMENTS);
>>> -- 
>>> 1.7.1
>>>
>>>
>>
>>
> 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 13/17] staging: lustre: lloop: Fix build failure on ppc64

2016-04-10 Thread Jeff Mahoney

This patch was shown not to work. I just haven't removed it from opensuse yet.

-Jeff

--
Jeff Mahoney
(apologies for the top post -- from my mobile)

> On Apr 10, 2016, at 9:13 AM, James Simmons <jsimm...@infradead.org> wrote:
> 
> From: Jeff Mahoney <je...@suse.com>
> 
> On ppc64 with 64k pages, we get a build failure in lloop:
> 
> drivers/staging/lustre/lustre/llite/lloop.c:527:2:
> note: in expansion of macro 'CLASSERT'
> CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
> 
> There's no need to change the queue's logical block size. Even if it could
> accept a 64k value, that would result in any file system on top of it
> needing to also use 64k blocks. It'd be safe to set it to 4k, but there's
> no actual need for it. It's not used to split requests except for WRITE_SAME,
> which lloop doesn't implement anyway.
> 
> Signed-off-by: Jeff Mahoney <je...@suse.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4000
> Reviewed-on: http://review.whamcloud.com/7745
> Reviewed-by: Jinshan Xiong <jinshan.xi...@intel.com>
> Reviewed-by: Minh Diep <minh.d...@intel.com>
> Reviewed-by: Oleg Drokin <oleg.dro...@intel.com>
> Signed-off-by: James Simmons <jsimm...@infradead.org>
> ---
> drivers/staging/lustre/lustre/llite/lloop.c |3 ---
> 1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/lloop.c 
> b/drivers/staging/lustre/lustre/llite/lloop.c
> index b725fc1..f396753 100644
> --- a/drivers/staging/lustre/lustre/llite/lloop.c
> +++ b/drivers/staging/lustre/lustre/llite/lloop.c
> @@ -525,9 +525,6 @@ static int loop_set_fd(struct lloop_device *lo, struct 
> file *unused,
>lo->lo_queue->queuedata = lo;
> 
>/* queue parameters */
> -CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
> -blk_queue_logical_block_size(lo->lo_queue,
> - (unsigned short)PAGE_CACHE_SIZE);
>blk_queue_max_hw_sectors(lo->lo_queue,
> LLOOP_MAX_SEGMENTS << (PAGE_CACHE_SHIFT - 9));
>blk_queue_max_segments(lo->lo_queue, LLOOP_MAX_SEGMENTS);
> -- 
> 1.7.1
> 
>

Re: [PATCH 13/17] staging: lustre: lloop: Fix build failure on ppc64

2016-04-10 Thread Jeff Mahoney

This patch was shown not to work. I just haven't removed it from opensuse yet.

-Jeff

--
Jeff Mahoney
(apologies for the top post -- from my mobile)

> On Apr 10, 2016, at 9:13 AM, James Simmons  wrote:
> 
> From: Jeff Mahoney 
> 
> On ppc64 with 64k pages, we get a build failure in lloop:
> 
> drivers/staging/lustre/lustre/llite/lloop.c:527:2:
> note: in expansion of macro 'CLASSERT'
> CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
> 
> There's no need to change the queue's logical block size. Even if it could
> accept a 64k value, that would result in any file system on top of it
> needing to also use 64k blocks. It'd be safe to set it to 4k, but there's
> no actual need for it. It's not used to split requests except for WRITE_SAME,
> which lloop doesn't implement anyway.
> 
> Signed-off-by: Jeff Mahoney 
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4000
> Reviewed-on: http://review.whamcloud.com/7745
> Reviewed-by: Jinshan Xiong 
> Reviewed-by: Minh Diep 
> Reviewed-by: Oleg Drokin 
> Signed-off-by: James Simmons 
> ---
> drivers/staging/lustre/lustre/llite/lloop.c |3 ---
> 1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/llite/lloop.c 
> b/drivers/staging/lustre/lustre/llite/lloop.c
> index b725fc1..f396753 100644
> --- a/drivers/staging/lustre/lustre/llite/lloop.c
> +++ b/drivers/staging/lustre/lustre/llite/lloop.c
> @@ -525,9 +525,6 @@ static int loop_set_fd(struct lloop_device *lo, struct 
> file *unused,
>lo->lo_queue->queuedata = lo;
> 
>/* queue parameters */
> -CLASSERT(PAGE_CACHE_SIZE < (1 << (sizeof(unsigned short) * 8)));
> -blk_queue_logical_block_size(lo->lo_queue,
> - (unsigned short)PAGE_CACHE_SIZE);
>blk_queue_max_hw_sectors(lo->lo_queue,
> LLOOP_MAX_SEGMENTS << (PAGE_CACHE_SHIFT - 9));
>blk_queue_max_segments(lo->lo_queue, LLOOP_MAX_SEGMENTS);
> -- 
> 1.7.1
> 
>

Re: [PATCHv3 02/13] scripts/gdb: Provide kernel list item generators

2016-03-07 Thread Jeff Mahoney

On 3/3/16 6:40 AM, Kieran Bingham wrote:
> Facilitate linked-list items by providing a generator to return
> the dereferenced, and type-cast objects from a kernel linked list
> 
> CC: Jeff Mahoney <je...@suse.com>
> 
> Signed-off-by: Kieran Bingham <kieran.bing...@linaro.org>
> ---
> Changes since v1:
>  * items function removed, and replaced with Jeff Mahoney's cleaner
>implementations of list_for_each, and list_for_each_entry
> ---
>  scripts/gdb/linux/lists.py | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/scripts/gdb/linux/lists.py b/scripts/gdb/linux/lists.py
> index 3a3775bc162b..9f4503738e26 100644
> --- a/scripts/gdb/linux/lists.py
> +++ b/scripts/gdb/linux/lists.py
> @@ -18,6 +18,26 @@ from linux import utils
>  list_head = utils.CachedType("struct list_head")
>  
>  
> +def list_for_each(head):
> +if head.type == list_head.get_type().pointer():
> +head = head.dereference()
> +elif head.type != list_head.get_type():
> +raise gdb.GdbError("Must be struct list_head not %s" % 
> list_head.type)

Shouldn't this be % head.type?

> +
> +node = head['next'].dereference()
> +while node.address != head.address:
> +yield node.address
> +node = node['next'].dereference()
> +
> +
> +def list_for_each_entry(head, gdbtype, member):
> +for node in list_for_each(head):
> +if node.type != list_head.get_type().pointer():
> +raise TypeError("Type %s found. "
> +"Expected struct list_head *." % node.type)

Nit, but FWIW, I've adopted the kernel style of always keeping strings
on one line so they're easily greppable.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCHv3 02/13] scripts/gdb: Provide kernel list item generators

2016-03-07 Thread Jeff Mahoney

On 3/3/16 6:40 AM, Kieran Bingham wrote:
> Facilitate linked-list items by providing a generator to return
> the dereferenced, and type-cast objects from a kernel linked list
> 
> CC: Jeff Mahoney 
> 
> Signed-off-by: Kieran Bingham 
> ---
> Changes since v1:
>  * items function removed, and replaced with Jeff Mahoney's cleaner
>implementations of list_for_each, and list_for_each_entry
> ---
>  scripts/gdb/linux/lists.py | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/scripts/gdb/linux/lists.py b/scripts/gdb/linux/lists.py
> index 3a3775bc162b..9f4503738e26 100644
> --- a/scripts/gdb/linux/lists.py
> +++ b/scripts/gdb/linux/lists.py
> @@ -18,6 +18,26 @@ from linux import utils
>  list_head = utils.CachedType("struct list_head")
>  
>  
> +def list_for_each(head):
> +if head.type == list_head.get_type().pointer():
> +head = head.dereference()
> +elif head.type != list_head.get_type():
> +raise gdb.GdbError("Must be struct list_head not %s" % 
> list_head.type)

Shouldn't this be % head.type?

> +
> +node = head['next'].dereference()
> +while node.address != head.address:
> +yield node.address
> +node = node['next'].dereference()
> +
> +
> +def list_for_each_entry(head, gdbtype, member):
> +for node in list_for_each(head):
> +if node.type != list_head.get_type().pointer():
> +raise TypeError("Type %s found. "
> +"Expected struct list_head *." % node.type)

Nit, but FWIW, I've adopted the kernel style of always keeping strings
on one line so they're easily greppable.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] btrfs: zero out delayed node upon allocation

2015-10-25 Thread Jeff Mahoney

On Oct 25, 2015, at 3:50 PM, Alexandru Moise <00moses.alexande...@gmail.com> 
wrote:

>>> This allows us to trim out half of btrfs_init_delayed_node() which
>>> is now reduntant.
>> 
>> It's redundant if kmem_cache_zalloc is used, but you haven't
>> documented that doing so is now required.  For all of these changes
>> you've posted, if they're to be accepted, I'd really prefer to set up
>> the slab with a constructor instead.  Then we don't need to worry
>> about such guarantees.  The object returned via kmem_cache_alloc will
>> always be properly initialized.
> 
> Well I wouldn't say it's *required* just makes this particular piece
> of code neater, since we memset-zero the node's inode_item _anyways_.

But the rest of the delayed node still needs to be initialized. That's 
happening implicitly with your patch via zalloc but if anyone ever adds a new 
allocation from that cache, they'll need to know that zalloc is required for 
proper initializatiom. Documenting that is one approach. Constructors are 
another.

> I like the constructor idea though, do you suggest I should invest in
> that idea?

Well, see below.

>> 
>> This makes assumptions about atomic_t and what atomic_set does that
>> aren't guaranteed to be true.  When accessors/mutators are part of the
>> API they should be used.
>> 
>> - -Jeff
> 
> You're right, taking out that atomic_set was really stupid. I'll
> resent the patch with a proper explanation in the commit message and
> put the atomic_set back.
> 
> Unless you feel that the change is rather pointless, I'll gladly back
> off :-).

I don't want to dissuade new contributors but there's plenty of work to be done 
before we start worrying about cleaning initializers. These aren't hot paths 
(but if they were, the implicit memset of the whole object might have negative 
effects.) The immediate impact of cleanup patches like these is code churn so 
that developers with outstanding patch sets need to rebase on top of new 
context. It's a normal part of the development cycle but it's more work for not 
a lot of benefit.

So, if your goal is to contribute to btrfs, I'd suggest digging into the code. 
Try out changes. Break it and figure out how you broke it. It's like learning 
how to navigate in a new city: by getting lost you learn more. :)

-Jeff--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] btrfs: zero out delayed node upon allocation

2015-10-25 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 10/25/15 1:48 PM, Alexandru Moise wrote:
> This allows us to trim out half of btrfs_init_delayed_node() which
> is now reduntant.

It's redundant if kmem_cache_zalloc is used, but you haven't
documented that doing so is now required.  For all of these changes
you've posted, if they're to be accepted, I'd really prefer to set up
the slab with a constructor instead.  Then we don't need to worry
about such guarantees.  The object returned via kmem_cache_alloc will
always be properly initialized.

> Signed-off-by: Alexandru Moise <00moses.alexande...@gmail.com> --- 
> fs/btrfs/delayed-inode.c | 8 +--- 1 file changed, 1
> insertion(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c 
> index a2ae427..7aa84be 100644 --- a/fs/btrfs/delayed-inode.c +++
> b/fs/btrfs/delayed-inode.c @@ -53,17 +53,11 @@ static inline void
> btrfs_init_delayed_node( { delayed_node->root = root; 
> delayed_node->inode_id = inode_id; -
> atomic_set(_node->refs, 0);

This makes assumptions about atomic_t and what atomic_set does that
aren't guaranteed to be true.  When accessors/mutators are part of the
API they should be used.

- -Jeff

> - delayed_node->count = 0; -  delayed_node->flags = 0; 
> delayed_node->ins_root = RB_ROOT; delayed_node->del_root =
> RB_ROOT; mutex_init(_node->mutex); -
> delayed_node->index_cnt = 0; 
> INIT_LIST_HEAD(_node->n_list); 
> INIT_LIST_HEAD(_node->p_list); -
> delayed_node->bytes_reserved = 0; -
> memset(_node->inode_item, 0,
> sizeof(delayed_node->inode_item)); }
> 
> static inline int btrfs_is_continuous_delayed_item( @@ -132,7
> +126,7 @@ again: if (node) return node;
> 
> - node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS); +node =
> kmem_cache_zalloc(delayed_node_cache, GFP_NOFS); if (!node) return
> ERR_PTR(-ENOMEM); btrfs_init_delayed_node(node, root, ino);
> 


- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2

iQIcBAEBCAAGBQJWLP5RAAoJEB57S2MheeWyQMoQAMUNGQFl14cwbJWtSsNjwZJy
eRaSckKcRinc4zoFxATSn4R/UQLadFNg+tAtSnNgoJngpg/nn6Fc19vMS5QVpU1c
OhPgYUVkBTjLxhBaY6Iivinbx4+pjFaEP+e6LoqXefelB6Wq17e8DavQcDU5gH2e
3vzXLt8kKa1hlC5SbtjW59cLgHzrclR6h4qeO00R7vWPA4YyIzOKOnYznrgZ69wS
V97YBmcyucrl+EENYFe5BYBJmP5LojHjsxfPtF2zzZnvCpSrWPZ5Num/2yCofFir
7HNs86wLEBjsEUhkSjZ/7u3XATjVOXNHVUsknOOMB/B6zk9ngnK51mp7a2Ib41PM
nMbNOVHhauumY9sf++RWjN53HTGH8aAlp7B0JIRTceGrvKdT2YGjW8iWwIrtlCE0
bSoEXRNcMYFvR+0we/WKVi4sX0ysRfhDbCuQeWPEJ6R4DbzmtqjCmmHQnZyVpwex
1ZxT8yvFVJvQHEl2E+mHOneRgdVuj89j8AZJ99idIHKa9b60tHAlkq+zXoW0odIt
qV09bYoc7J87W5JvTgKVks13ovwKmM7Ij7zHFMBLVO/BGrd+zFFW1qq8r6GJBKG9
PaXt+HUHBCt4TpvAza5JZ0Yn7tYvALHizj+IZStTw2Kz1vQC5KEnjZIMa3LcKrsx
r2oQLSfGkcsf8EcIzDX1
=qINk
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] btrfs: zero out delayed node upon allocation

2015-10-25 Thread Jeff Mahoney

On Oct 25, 2015, at 3:50 PM, Alexandru Moise <00moses.alexande...@gmail.com> 
wrote:

>>> This allows us to trim out half of btrfs_init_delayed_node() which
>>> is now reduntant.
>> 
>> It's redundant if kmem_cache_zalloc is used, but you haven't
>> documented that doing so is now required.  For all of these changes
>> you've posted, if they're to be accepted, I'd really prefer to set up
>> the slab with a constructor instead.  Then we don't need to worry
>> about such guarantees.  The object returned via kmem_cache_alloc will
>> always be properly initialized.
> 
> Well I wouldn't say it's *required* just makes this particular piece
> of code neater, since we memset-zero the node's inode_item _anyways_.

But the rest of the delayed node still needs to be initialized. That's 
happening implicitly with your patch via zalloc but if anyone ever adds a new 
allocation from that cache, they'll need to know that zalloc is required for 
proper initializatiom. Documenting that is one approach. Constructors are 
another.

> I like the constructor idea though, do you suggest I should invest in
> that idea?

Well, see below.

>> 
>> This makes assumptions about atomic_t and what atomic_set does that
>> aren't guaranteed to be true.  When accessors/mutators are part of the
>> API they should be used.
>> 
>> - -Jeff
> 
> You're right, taking out that atomic_set was really stupid. I'll
> resent the patch with a proper explanation in the commit message and
> put the atomic_set back.
> 
> Unless you feel that the change is rather pointless, I'll gladly back
> off :-).

I don't want to dissuade new contributors but there's plenty of work to be done 
before we start worrying about cleaning initializers. These aren't hot paths 
(but if they were, the implicit memset of the whole object might have negative 
effects.) The immediate impact of cleanup patches like these is code churn so 
that developers with outstanding patch sets need to rebase on top of new 
context. It's a normal part of the development cycle but it's more work for not 
a lot of benefit.

So, if your goal is to contribute to btrfs, I'd suggest digging into the code. 
Try out changes. Break it and figure out how you broke it. It's like learning 
how to navigate in a new city: by getting lost you learn more. :)

-Jeff--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] btrfs: zero out delayed node upon allocation

2015-10-25 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 10/25/15 1:48 PM, Alexandru Moise wrote:
> This allows us to trim out half of btrfs_init_delayed_node() which
> is now reduntant.

It's redundant if kmem_cache_zalloc is used, but you haven't
documented that doing so is now required.  For all of these changes
you've posted, if they're to be accepted, I'd really prefer to set up
the slab with a constructor instead.  Then we don't need to worry
about such guarantees.  The object returned via kmem_cache_alloc will
always be properly initialized.

> Signed-off-by: Alexandru Moise <00moses.alexande...@gmail.com> --- 
> fs/btrfs/delayed-inode.c | 8 +--- 1 file changed, 1
> insertion(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c 
> index a2ae427..7aa84be 100644 --- a/fs/btrfs/delayed-inode.c +++
> b/fs/btrfs/delayed-inode.c @@ -53,17 +53,11 @@ static inline void
> btrfs_init_delayed_node( { delayed_node->root = root; 
> delayed_node->inode_id = inode_id; -
> atomic_set(_node->refs, 0);

This makes assumptions about atomic_t and what atomic_set does that
aren't guaranteed to be true.  When accessors/mutators are part of the
API they should be used.

- -Jeff

> - delayed_node->count = 0; -  delayed_node->flags = 0; 
> delayed_node->ins_root = RB_ROOT; delayed_node->del_root =
> RB_ROOT; mutex_init(_node->mutex); -
> delayed_node->index_cnt = 0; 
> INIT_LIST_HEAD(_node->n_list); 
> INIT_LIST_HEAD(_node->p_list); -
> delayed_node->bytes_reserved = 0; -
> memset(_node->inode_item, 0,
> sizeof(delayed_node->inode_item)); }
> 
> static inline int btrfs_is_continuous_delayed_item( @@ -132,7
> +126,7 @@ again: if (node) return node;
> 
> - node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS); +node =
> kmem_cache_zalloc(delayed_node_cache, GFP_NOFS); if (!node) return
> ERR_PTR(-ENOMEM); btrfs_init_delayed_node(node, root, ino);
> 


- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2

iQIcBAEBCAAGBQJWLP5RAAoJEB57S2MheeWyQMoQAMUNGQFl14cwbJWtSsNjwZJy
eRaSckKcRinc4zoFxATSn4R/UQLadFNg+tAtSnNgoJngpg/nn6Fc19vMS5QVpU1c
OhPgYUVkBTjLxhBaY6Iivinbx4+pjFaEP+e6LoqXefelB6Wq17e8DavQcDU5gH2e
3vzXLt8kKa1hlC5SbtjW59cLgHzrclR6h4qeO00R7vWPA4YyIzOKOnYznrgZ69wS
V97YBmcyucrl+EENYFe5BYBJmP5LojHjsxfPtF2zzZnvCpSrWPZ5Num/2yCofFir
7HNs86wLEBjsEUhkSjZ/7u3XATjVOXNHVUsknOOMB/B6zk9ngnK51mp7a2Ib41PM
nMbNOVHhauumY9sf++RWjN53HTGH8aAlp7B0JIRTceGrvKdT2YGjW8iWwIrtlCE0
bSoEXRNcMYFvR+0we/WKVi4sX0ysRfhDbCuQeWPEJ6R4DbzmtqjCmmHQnZyVpwex
1ZxT8yvFVJvQHEl2E+mHOneRgdVuj89j8AZJ99idIHKa9b60tHAlkq+zXoW0odIt
qV09bYoc7J87W5JvTgKVks13ovwKmM7Ij7zHFMBLVO/BGrd+zFFW1qq8r6GJBKG9
PaXt+HUHBCt4TpvAza5JZ0Yn7tYvALHizj+IZStTw2Kz1vQC5KEnjZIMa3LcKrsx
r2oQLSfGkcsf8EcIzDX1
=qINk
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PULL] Ext3 removal, quota & udf fixes

2015-09-01 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 9/1/15 8:52 AM, Eric Sandeen wrote:
> On 8/31/15 5:39 PM, Linus Torvalds wrote:
>> On Mon, Aug 31, 2015 at 3:31 PM, Raymond Jennings
>>  wrote:
>>> 
>>> That said, I wouldn't mind myself if the ext4 driver were given
>>> a very grueling regression test to make sure it can actually
>>> handle old ext3 systems as well as the ext3 driver can.
>> 
>> That's not my only worry. Things like "can you go back to
>> ext3-only" is an issue too - I don't think that's been a big
>> priority for ext4 any more, and if there are any existing
>> hold-outs that still use ext3, they may want to be able to go
>> back to old kernels.
> 
> Others have said it as well, but I'll restate: Yes, you can go
> "back."
> 
> But there's no real "going back" needed, because mounting an ext3 
> filesystem with ext4.ko hasn't "gone anywhere."  If you use
> ext4.ko, you are running with a newer driver, but not a newer disk
> format.
> 
>> So it's not just a "you can use ext4 instead" issue. Can you do
>> that *without* then forcing an upgrade forever on that partition?
>> I'm not sure the ext4 people are really even willing to guarantee
>> that kind of backwards compatibility.
> 
> Yeah, we are, I think I remember making an explicit stink about
> that quite a while ago.  But I'm satisfied; we do use
> CONFIG_EXT4_USE_FOR_EXT23 in RHEL7, and we wouldn't do that if it
> might change disk format behind our users' backs.

As a "me too" datapoint, we do the same thing with SLE12.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJV5cGbAAoJEB57S2MheeWy9z8QALz0EPV2V9MPzSTkQ/DyvcEK
V8GxQyHpOSwVLqU/J8f9+1vM7dbRR9bjXuN8YAZ/bmTf5GQA4opuCZK4KanK3xeo
JSQeFfO9O4gb3aiKdVxxXS3RhejR4WhQ+nQQ3HOIIXCjauU4vd88UH1tjcBR8PdE
DXKtq1KpqIWc5zLr22AgTRwkDiYauaziK2i6922DVl0rSOjfFw4BoV91pgE0pLgi
ojXKosmpST1VOlHKw7l2yOeMdVYUn1wJFgOprCXSLNwckszcqLgNbRYaOaiZmh7Z
c4zWdOkKQDI1/+THtNniF7xiDx2vFgtCSKbaU+hdsshxBHntLIIRj7vqOqfVLb85
m8lyshQDen8y7bahimCfr/hIGyXBuotl+jS4sryABInBf3AkuR3kI9gYJrb6YDWU
jt/0XnT4OIboB1G/Op+SLeNBBR+1RQn9iu559dw2JFQJFeT9JJOvbyaYU5Ijl8I4
eBxebee9laat3atbKVe4HTzGgNzbwCoA8KZuhtaDncR+jd6tsiSLsdlVKZw/P3g9
J35QNlOt6FyecDa9G3OYcpLsxffdv/fke2yT+R6LiN54oVY77OYRjFNY2X27yHWX
76qAOj+tkntfvFV5BjElT+uqXxqebb/kwLt2gUJJ68U/lVVYFHDJKjE2XXdf5omy
/N/PcMenBbsGq6H0UvzS
=G4VO
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PULL] Ext3 removal, quota & udf fixes

2015-09-01 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 9/1/15 8:52 AM, Eric Sandeen wrote:
> On 8/31/15 5:39 PM, Linus Torvalds wrote:
>> On Mon, Aug 31, 2015 at 3:31 PM, Raymond Jennings
>> <shent...@gmail.com> wrote:
>>> 
>>> That said, I wouldn't mind myself if the ext4 driver were given
>>> a very grueling regression test to make sure it can actually
>>> handle old ext3 systems as well as the ext3 driver can.
>> 
>> That's not my only worry. Things like "can you go back to
>> ext3-only" is an issue too - I don't think that's been a big
>> priority for ext4 any more, and if there are any existing
>> hold-outs that still use ext3, they may want to be able to go
>> back to old kernels.
> 
> Others have said it as well, but I'll restate: Yes, you can go
> "back."
> 
> But there's no real "going back" needed, because mounting an ext3 
> filesystem with ext4.ko hasn't "gone anywhere."  If you use
> ext4.ko, you are running with a newer driver, but not a newer disk
> format.
> 
>> So it's not just a "you can use ext4 instead" issue. Can you do
>> that *without* then forcing an upgrade forever on that partition?
>> I'm not sure the ext4 people are really even willing to guarantee
>> that kind of backwards compatibility.
> 
> Yeah, we are, I think I remember making an explicit stink about
> that quite a while ago.  But I'm satisfied; we do use
> CONFIG_EXT4_USE_FOR_EXT23 in RHEL7, and we wouldn't do that if it
> might change disk format behind our users' backs.

As a "me too" datapoint, we do the same thing with SLE12.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJV5cGbAAoJEB57S2MheeWy9z8QALz0EPV2V9MPzSTkQ/DyvcEK
V8GxQyHpOSwVLqU/J8f9+1vM7dbRR9bjXuN8YAZ/bmTf5GQA4opuCZK4KanK3xeo
JSQeFfO9O4gb3aiKdVxxXS3RhejR4WhQ+nQQ3HOIIXCjauU4vd88UH1tjcBR8PdE
DXKtq1KpqIWc5zLr22AgTRwkDiYauaziK2i6922DVl0rSOjfFw4BoV91pgE0pLgi
ojXKosmpST1VOlHKw7l2yOeMdVYUn1wJFgOprCXSLNwckszcqLgNbRYaOaiZmh7Z
c4zWdOkKQDI1/+THtNniF7xiDx2vFgtCSKbaU+hdsshxBHntLIIRj7vqOqfVLb85
m8lyshQDen8y7bahimCfr/hIGyXBuotl+jS4sryABInBf3AkuR3kI9gYJrb6YDWU
jt/0XnT4OIboB1G/Op+SLeNBBR+1RQn9iu559dw2JFQJFeT9JJOvbyaYU5Ijl8I4
eBxebee9laat3atbKVe4HTzGgNzbwCoA8KZuhtaDncR+jd6tsiSLsdlVKZw/P3g9
J35QNlOt6FyecDa9G3OYcpLsxffdv/fke2yT+R6LiN54oVY77OYRjFNY2X27yHWX
76qAOj+tkntfvFV5BjElT+uqXxqebb/kwLt2gUJJ68U/lVVYFHDJKjE2XXdf5omy
/N/PcMenBbsGq6H0UvzS
=G4VO
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] reiserfs: fix __RASSERT format string

2015-03-17 Thread Jeff Mahoney

On 3/16/15 9:45 AM, Nicolas Iooss wrote:
> __RASSERT format string does not use the PID argument.
> reiserfs_panic arguments are therefore formatted with the wrong
> format specifier (for example __LINE__ with %s).  This bug was
> introduced when commit c3a9c2109f84 ("reiserfs: rework
> reiserfs_panic") removed a "reiserfs[%i]" prefix.
> 
> This bug is only triggered when using CONFIG_REISERFS_CHECK,
> otherwise __RASSERT is never used.
> 
> Signed-off-by: Nicolas Iooss 
Acked-by: Jeff Mahoney 

> Fixes: c3a9c2109f84 ("reiserfs: rework reiserfs_panic") --- 
> fs/reiserfs/reiserfs.h | 1 - 1 file changed, 1 deletion(-)
> 
> diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h index
> bb79cddf0a1f..2adcde137c3f 100644 --- a/fs/reiserfs/reiserfs.h +++
> b/fs/reiserfs/reiserfs.h @@ -910,7 +910,6 @@ do { 
> \ if
> (!(cond)) \ 
> reiserfs_panic(NULL, "assertion failure", "("
> #cond ") at " \ __FILE__ ":%i:%s: " format "\n",      \ -
> in_interrupt() ? -1 : task_pid_nr(current), \ __LINE__, __func__ ,
> ##args);  \ } while (0)
> 
> 


-- 
Jeff Mahoney
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] reiserfs: fix __RASSERT format string

2015-03-17 Thread Jeff Mahoney

On 3/16/15 9:45 AM, Nicolas Iooss wrote:
 __RASSERT format string does not use the PID argument.
 reiserfs_panic arguments are therefore formatted with the wrong
 format specifier (for example __LINE__ with %s).  This bug was
 introduced when commit c3a9c2109f84 (reiserfs: rework
 reiserfs_panic) removed a reiserfs[%i] prefix.
 
 This bug is only triggered when using CONFIG_REISERFS_CHECK,
 otherwise __RASSERT is never used.
 
 Signed-off-by: Nicolas Iooss nicolas.iooss_li...@m4x.org
Acked-by: Jeff Mahoney je...@suse.com

 Fixes: c3a9c2109f84 (reiserfs: rework reiserfs_panic) --- 
 fs/reiserfs/reiserfs.h | 1 - 1 file changed, 1 deletion(-)
 
 diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h index
 bb79cddf0a1f..2adcde137c3f 100644 --- a/fs/reiserfs/reiserfs.h +++
 b/fs/reiserfs/reiserfs.h @@ -910,7 +910,6 @@ do { 
 \ if
 (!(cond)) \ 
 reiserfs_panic(NULL, assertion failure, (
 #cond ) at  \ __FILE__ :%i:%s:  format \n,  \ -
 in_interrupt() ? -1 : task_pid_nr(current), \ __LINE__, __func__ ,
 ##args);  \ } while (0)
 
 


-- 
Jeff Mahoney
SUSE Labs

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: reiserfs: inconsistent format in __RASSERT

2015-03-16 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 3/16/15 8:55 AM, Nicolas Iooss wrote:
> Hello,
> 
> When adding a __printf attribute to reiserfs_panic, gcc reported
> an inconsistent format for __RASSERT.  This macro is currently
> defined in fs/reiserfs/reiserfs.h as:
> 
> reiserfs_panic(NULL, "assertion failure", "(" #cond ") at " \ 
> __FILE__ ":%i:%s: " format "\n",\ 
> in_interrupt() ? -1 : task_pid_nr(current), \ __LINE__,
> __func__ , ##args);
> 
> In the format string, the first parameter is a line number, but in
> the arguments there is a PID before.  Before c3a9c2109f84
> ("reiserfs: rework reiserfs_panic") [1], the format string began
> with "reiserfs[%i]" [2], which explains the PID in the arguments.
> 
> I see three possibilities:
> 
> * I missed something in my analysis and in fact the PID argument
> is processed by reiserfs_panic (don't know where), or * the PID
> argument is not used and should be removed, or

This, please. reiserfs_panic calls BUG(), which will contain the PID.

> * the PID is useful and "[%i]" should be added somewhere in the
> format string.
> 
> Which one would you prefer?
> 
> Also, I found this when building the kernel with "allmodconfig" on 
> x86_64.  With "defconfig" gcc does not report this error, but I
> guess it is because without CONFIG_REISERFS_CHECK, __RASSERT is
> never used.

Yeah. If reiserfs was more actively maintained, what is currently
protected by CONFIG_REISERFS_CHECK would be handled a bit better.
There are ton of fsfuzzer bugs that would be caught by it and should
be handled using reiserfs_error. Unfortunately, it also enables some
heavy checks that make the file system very slow.

Thanks for looking into this. It looks like it's been broken for a
while. I suppose the only saving grace is that it would crash in a
path that crashes on purpose a few lines later.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJVBtUEAAoJEB57S2MheeWysH4P/RBggjHOwREYHiq2RaY8H+sf
rSRaqf14xePP7vcWOvcQRkzjB2f6wnvD40i7j0vLqP5A6mjp+tdrSgl7P1KqGGBV
45oQuibM5LcrDA07cIgXYLVYZxiWCtOyDxjSfoNw4HsrP/gPIx5YevJseb/VZPON
AH1ywT8LSmKx25jz20f6mmfbSuqtHe+ceitVcyjRnTw6363ngSPKj48rpPpo9uQE
SJygrJy1kkEVw0P9EHSa03jSKggPIpEj40lV5L7BDKkEsqor+3jXZDHaM7qQq+N0
eYaYzIxBbWuf8jAHe/XDGDNo0TEjvFk6qgmdUKjn41j+mS4SbUZGk55QFJO32ecv
GK9a/leQ/YyfPS9HBsuk6g51O1RU34nSyMY/i6o//VncgIJqIaxiWMb0KR5f79uL
LWv/A4TDsFC0/o/O25FFFq2jte5i497aFzxpTI+KDRmzxBUM20QzkhwPz2tySace
X0KBsJoLdgXLZhHYSlm2iydCb4C0lt6M3Q42IUlCeB3DQSViFgHnLry0ALULMcOk
N061Pnv+BoM+yEScF5TEF+/S4QgtgqdxNsUzZTF/070rVgtbR0iimikkf2w2ejWM
nYHsrFXJBZ55PtLKrV2ujVg3e25DlHK2irrawWmUF+/9zA/CV08u73XnB+VjAnTm
8Y0B/t6I90I3urltJbYA
=Pcpd
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: reiserfs: inconsistent format in __RASSERT

2015-03-16 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 3/16/15 8:55 AM, Nicolas Iooss wrote:
 Hello,
 
 When adding a __printf attribute to reiserfs_panic, gcc reported
 an inconsistent format for __RASSERT.  This macro is currently
 defined in fs/reiserfs/reiserfs.h as:
 
 reiserfs_panic(NULL, assertion failure, ( #cond ) at  \ 
 __FILE__ :%i:%s:  format \n,\ 
 in_interrupt() ? -1 : task_pid_nr(current), \ __LINE__,
 __func__ , ##args);
 
 In the format string, the first parameter is a line number, but in
 the arguments there is a PID before.  Before c3a9c2109f84
 (reiserfs: rework reiserfs_panic) [1], the format string began
 with reiserfs[%i] [2], which explains the PID in the arguments.
 
 I see three possibilities:
 
 * I missed something in my analysis and in fact the PID argument
 is processed by reiserfs_panic (don't know where), or * the PID
 argument is not used and should be removed, or

This, please. reiserfs_panic calls BUG(), which will contain the PID.

 * the PID is useful and [%i] should be added somewhere in the
 format string.
 
 Which one would you prefer?
 
 Also, I found this when building the kernel with allmodconfig on 
 x86_64.  With defconfig gcc does not report this error, but I
 guess it is because without CONFIG_REISERFS_CHECK, __RASSERT is
 never used.

Yeah. If reiserfs was more actively maintained, what is currently
protected by CONFIG_REISERFS_CHECK would be handled a bit better.
There are ton of fsfuzzer bugs that would be caught by it and should
be handled using reiserfs_error. Unfortunately, it also enables some
heavy checks that make the file system very slow.

Thanks for looking into this. It looks like it's been broken for a
while. I suppose the only saving grace is that it would crash in a
path that crashes on purpose a few lines later.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJVBtUEAAoJEB57S2MheeWysH4P/RBggjHOwREYHiq2RaY8H+sf
rSRaqf14xePP7vcWOvcQRkzjB2f6wnvD40i7j0vLqP5A6mjp+tdrSgl7P1KqGGBV
45oQuibM5LcrDA07cIgXYLVYZxiWCtOyDxjSfoNw4HsrP/gPIx5YevJseb/VZPON
AH1ywT8LSmKx25jz20f6mmfbSuqtHe+ceitVcyjRnTw6363ngSPKj48rpPpo9uQE
SJygrJy1kkEVw0P9EHSa03jSKggPIpEj40lV5L7BDKkEsqor+3jXZDHaM7qQq+N0
eYaYzIxBbWuf8jAHe/XDGDNo0TEjvFk6qgmdUKjn41j+mS4SbUZGk55QFJO32ecv
GK9a/leQ/YyfPS9HBsuk6g51O1RU34nSyMY/i6o//VncgIJqIaxiWMb0KR5f79uL
LWv/A4TDsFC0/o/O25FFFq2jte5i497aFzxpTI+KDRmzxBUM20QzkhwPz2tySace
X0KBsJoLdgXLZhHYSlm2iydCb4C0lt6M3Q42IUlCeB3DQSViFgHnLry0ALULMcOk
N061Pnv+BoM+yEScF5TEF+/S4QgtgqdxNsUzZTF/070rVgtbR0iimikkf2w2ejWM
nYHsrFXJBZ55PtLKrV2ujVg3e25DlHK2irrawWmUF+/9zA/CV08u73XnB+VjAnTm
8Y0B/t6I90I3urltJbYA
=Pcpd
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] reiserfs: fix several reiserfs_warning calls

2015-03-08 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 3/6/15 5:41 AM, Nicolas Iooss wrote:
> Since commit 45b03d5e8e67 ("reiserfs: rework reiserfs_warning"), 
> reiserfs_warning takes an id and a format as arguments, not a
> single format argument.  However 4 calls still follow the old
> interface. Update them.
> 
> This bug was initially found by adding __printf(4, 5) attribute to 
> __reiserfs_warning and using "gcc -Wformat=2" when building the 
> kernel.
> 
> Fixes: 45b03d5e8e67 ("reiserfs: rework reiserfs_warning") 
> Signed-off-by: Nicolas Iooss 
Acked-by: Jeff Mahoney 

> --- fs/reiserfs/bitmap.c  | 4 ++-- fs/reiserfs/journal.c | 4 ++-- 
> fs/reiserfs/procfs.c  | 4 ++-- 3 files changed, 6 insertions(+), 6
> deletions(-)
> 
> diff --git a/fs/reiserfs/bitmap.c b/fs/reiserfs/bitmap.c index
> dc198bc64c61..1d02f184863d 100644 --- a/fs/reiserfs/bitmap.c +++
> b/fs/reiserfs/bitmap.c @@ -1423,8 +1423,8 @@ struct buffer_head
> *reiserfs_read_bitmap_block(struct super_block *sb,
> 
> bh = sb_bread(sb, block); if (bh == NULL) -   reiserfs_warning(sb,
> "sh-2029: %s: bitmap block (#%u) " -   "reading
> failed", __func__, block); +  reiserfs_warning(sb, "sh-2029", +
> "bitmap block (#%u) reading failed", block); else { if
> (buffer_locked(bh)) { int depth; diff --git a/fs/reiserfs/journal.c
> b/fs/reiserfs/journal.c index 9d6486d416a3..bb08e81ef9ec 100644 ---
> a/fs/reiserfs/journal.c +++ b/fs/reiserfs/journal.c @@ -2643,8
> +2643,8 @@ static int journal_init_dev(struct super_block *super, 
> if (IS_ERR(journal->j_dev_bd)) { result =
> PTR_ERR(journal->j_dev_bd); journal->j_dev_bd = NULL; -
> reiserfs_warning(super, -  "journal_init_dev: 
> Cannot open '%s':
> %i", +reiserfs_warning(super, NULL, + 
>  "Cannot open '%s':
> %i", jdev_name, result); return result; } diff --git
> a/fs/reiserfs/procfs.c b/fs/reiserfs/procfs.c index
> 621b9f381fe1..e6a1b2c34ef6 100644 --- a/fs/reiserfs/procfs.c +++
> b/fs/reiserfs/procfs.c @@ -436,7 +436,7 @@ int
> reiserfs_proc_info_init(struct super_block *sb) add_file(sb,
> "journal", show_journal); return 0; } -   reiserfs_warning(sb,
> "cannot create /proc/%s/%s", +reiserfs_warning(sb, NULL, "cannot
> create /proc/%s/%s", proc_info_root_name, b); return 1; } @@ -465,7
> +465,7 @@ int reiserfs_proc_info_global_init(void) if
> (proc_info_root == NULL) { proc_info_root =
> proc_mkdir(proc_info_root_name, NULL); if (!proc_info_root) { -
> reiserfs_warning(NULL, "cannot create /proc/%s", +
> reiserfs_warning(NULL, NULL, "cannot create /proc/%s", 
> proc_info_root_name); return 1; }
> 


- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)

iQIcBAEBAgAGBQJU/Nv+AAoJEB57S2MheeWyFX0QAIxVQtvUQ2IC7kdaEqtTSmz+
0isl/6OqZa+JA4jSG1GmVbhhY/yS9qCImdEozAyuNF1gJIsUzcrN/lyPzFWnnJDm
LD7DOKvcLp1PnZ7UsqBEjwy8vZ3YXNL9CbVxczij5Dx3z1PXnPGaLi6okf6vT2M1
vqp2ZwKfEvDZzN0cvt2fmkO9k1GQaUhOUkzjnZy1I6MCmPd3dF9hcr93g0/BlTWD
DeVS7Y+gjFP8eA3l2KJWrBbDHK59M3RCOpVdVhIyCu73dcJs94ZZBn8jCXN3kfpv
33xb/MWZE+vnNCo4PkmJCngKZN8gu/OW2nna1c0yzUlKkRMcipHKDB7RDxPAeeXL
2nz3LMv2MP94DCDdpxVjZi2I+2REHWAaxi62IMxEiOl9X7aVuZrn+UH5R/6E5/vp
H39yFVHFaKBLI8pLGdcnObcDNKPr7nFEzwvS4ngiYntcH4F8K9SBun+EgwB6gYLq
xQQXxDdNxfWGZCELz8Q8ZJQkM2seeOu2ImUkaxI+MUycrUKvkwz7ECtOdUQMlbUN
8EqSfiHrP62JZ5XrNdlm/uIvKaFyq8PwUZNXQtl28Op6PrP6CAIpe5/5q+95L18f
TDTMyiVOO52B6C3wLheQnfswN3b2/xwebd84BIUmYPPdwWRbVH6t5GvBFkExMIH9
kqefbLXOQDaBLzhaxIvt
=/bVH
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] reiserfs: fix several reiserfs_warning calls

2015-03-08 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 3/6/15 5:41 AM, Nicolas Iooss wrote:
 Since commit 45b03d5e8e67 (reiserfs: rework reiserfs_warning), 
 reiserfs_warning takes an id and a format as arguments, not a
 single format argument.  However 4 calls still follow the old
 interface. Update them.
 
 This bug was initially found by adding __printf(4, 5) attribute to 
 __reiserfs_warning and using gcc -Wformat=2 when building the 
 kernel.
 
 Fixes: 45b03d5e8e67 (reiserfs: rework reiserfs_warning) 
 Signed-off-by: Nicolas Iooss nicolas.iooss_li...@m4x.org
Acked-by: Jeff Mahoney je...@suse.com

 --- fs/reiserfs/bitmap.c  | 4 ++-- fs/reiserfs/journal.c | 4 ++-- 
 fs/reiserfs/procfs.c  | 4 ++-- 3 files changed, 6 insertions(+), 6
 deletions(-)
 
 diff --git a/fs/reiserfs/bitmap.c b/fs/reiserfs/bitmap.c index
 dc198bc64c61..1d02f184863d 100644 --- a/fs/reiserfs/bitmap.c +++
 b/fs/reiserfs/bitmap.c @@ -1423,8 +1423,8 @@ struct buffer_head
 *reiserfs_read_bitmap_block(struct super_block *sb,
 
 bh = sb_bread(sb, block); if (bh == NULL) -   reiserfs_warning(sb,
 sh-2029: %s: bitmap block (#%u)  -   reading
 failed, __func__, block); +  reiserfs_warning(sb, sh-2029, +
 bitmap block (#%u) reading failed, block); else { if
 (buffer_locked(bh)) { int depth; diff --git a/fs/reiserfs/journal.c
 b/fs/reiserfs/journal.c index 9d6486d416a3..bb08e81ef9ec 100644 ---
 a/fs/reiserfs/journal.c +++ b/fs/reiserfs/journal.c @@ -2643,8
 +2643,8 @@ static int journal_init_dev(struct super_block *super, 
 if (IS_ERR(journal-j_dev_bd)) { result =
 PTR_ERR(journal-j_dev_bd); journal-j_dev_bd = NULL; -
 reiserfs_warning(super, -  journal_init_dev: 
 Cannot open '%s':
 %i, +reiserfs_warning(super, NULL, + 
  Cannot open '%s':
 %i, jdev_name, result); return result; } diff --git
 a/fs/reiserfs/procfs.c b/fs/reiserfs/procfs.c index
 621b9f381fe1..e6a1b2c34ef6 100644 --- a/fs/reiserfs/procfs.c +++
 b/fs/reiserfs/procfs.c @@ -436,7 +436,7 @@ int
 reiserfs_proc_info_init(struct super_block *sb) add_file(sb,
 journal, show_journal); return 0; } -   reiserfs_warning(sb,
 cannot create /proc/%s/%s, +reiserfs_warning(sb, NULL, cannot
 create /proc/%s/%s, proc_info_root_name, b); return 1; } @@ -465,7
 +465,7 @@ int reiserfs_proc_info_global_init(void) if
 (proc_info_root == NULL) { proc_info_root =
 proc_mkdir(proc_info_root_name, NULL); if (!proc_info_root) { -
 reiserfs_warning(NULL, cannot create /proc/%s, +
 reiserfs_warning(NULL, NULL, cannot create /proc/%s, 
 proc_info_root_name); return 1; }
 


- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)

iQIcBAEBAgAGBQJU/Nv+AAoJEB57S2MheeWyFX0QAIxVQtvUQ2IC7kdaEqtTSmz+
0isl/6OqZa+JA4jSG1GmVbhhY/yS9qCImdEozAyuNF1gJIsUzcrN/lyPzFWnnJDm
LD7DOKvcLp1PnZ7UsqBEjwy8vZ3YXNL9CbVxczij5Dx3z1PXnPGaLi6okf6vT2M1
vqp2ZwKfEvDZzN0cvt2fmkO9k1GQaUhOUkzjnZy1I6MCmPd3dF9hcr93g0/BlTWD
DeVS7Y+gjFP8eA3l2KJWrBbDHK59M3RCOpVdVhIyCu73dcJs94ZZBn8jCXN3kfpv
33xb/MWZE+vnNCo4PkmJCngKZN8gu/OW2nna1c0yzUlKkRMcipHKDB7RDxPAeeXL
2nz3LMv2MP94DCDdpxVjZi2I+2REHWAaxi62IMxEiOl9X7aVuZrn+UH5R/6E5/vp
H39yFVHFaKBLI8pLGdcnObcDNKPr7nFEzwvS4ngiYntcH4F8K9SBun+EgwB6gYLq
xQQXxDdNxfWGZCELz8Q8ZJQkM2seeOu2ImUkaxI+MUycrUKvkwz7ECtOdUQMlbUN
8EqSfiHrP62JZ5XrNdlm/uIvKaFyq8PwUZNXQtl28Op6PrP6CAIpe5/5q+95L18f
TDTMyiVOO52B6C3wLheQnfswN3b2/xwebd84BIUmYPPdwWRbVH6t5GvBFkExMIH9
kqefbLXOQDaBLzhaxIvt
=/bVH
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] vdso: don't require 64-bit math in standalone test

2014-10-24 Thread Jeff Mahoney

The use of 64-bit math on i386 causes build failures:
vdso_standalone_test_x86.c:(.text+0x101): undefined reference to `__umoddi3'
vdso_standalone_test_x86.c:(.text+0x12d): undefined reference to `__udivdi3'

Commit adb19fb66ee (Documentation: add makefiles for more targets) is
now building this by default, so it's failing the kernel build entirely.

Switching the declaration from uint64_t to time_t does the right thing
and handles the x32 case automatically.

Signed-off-by: Jeff Mahoney 
---
 Documentation/vDSO/vdso_standalone_test_x86.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/vDSO/vdso_standalone_test_x86.c
+++ b/Documentation/vDSO/vdso_standalone_test_x86.c
@@ -63,7 +63,7 @@ static inline void linux_exit(int code)
x86_syscall3(__NR_exit, code, 0, 0);
 }
 
-void to_base10(char *lastdig, uint64_t n)
+void to_base10(char *lastdig, time_t n)
 {
while (n) {
*lastdig = (n % 10) + '0';

-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] vdso: don't require 64-bit math in standalone test

2014-10-24 Thread Jeff Mahoney

The use of 64-bit math on i386 causes build failures:
vdso_standalone_test_x86.c:(.text+0x101): undefined reference to `__umoddi3'
vdso_standalone_test_x86.c:(.text+0x12d): undefined reference to `__udivdi3'

Commit adb19fb66ee (Documentation: add makefiles for more targets) is
now building this by default, so it's failing the kernel build entirely.

Switching the declaration from uint64_t to time_t does the right thing
and handles the x32 case automatically.

Signed-off-by: Jeff Mahoney je...@suse.com
---
 Documentation/vDSO/vdso_standalone_test_x86.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/vDSO/vdso_standalone_test_x86.c
+++ b/Documentation/vDSO/vdso_standalone_test_x86.c
@@ -63,7 +63,7 @@ static inline void linux_exit(int code)
x86_syscall3(__NR_exit, code, 0, 0);
 }
 
-void to_base10(char *lastdig, uint64_t n)
+void to_base10(char *lastdig, time_t n)
 {
while (n) {
*lastdig = (n % 10) + '0';

-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-09 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/9/14, 12:13 PM, Ming Lei wrote:
> On Thu, Oct 9, 2014 at 11:25 PM, Ming Lei 
> wrote:
>> On Thu, Oct 9, 2014 at 10:26 PM, Jeff Mahoney 
>> wrote:
>>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA1
>>> 
>>> On 10/9/14, 9:53 AM, Jeff Moyer wrote:
>>>> Jeff Mahoney  writes:
>>>> 
>>>>> Commit 05f1dd53152173 (block: add queue flag for disabling
>>>>> SG merging) uses bi_vcnt to assign bio->bi_phys_segments if
>>>>> sg merging is disabled. When using device mapper on top of
>>>>> a blk-mq device (virtio_blk in my test), we'd end up
>>>>> overflowing the scatterlist in __blk_bios_map_sg.
>>>>> 
>>>>> __bio_clone_fast copies bi_iter and bi_io_vec but not
>>>>> bi_vcnt, so blk_recount_segments would report
>>>>> bi_phys_segments as 0. Since rq->nr_phys_segments is 0 as
>>>>> well, the checks to ensure that we don't exceed the queue's
>>>>> segment limit end up allowing more bios (and segments) to
>>>>> attach the a request until we finally map it. That also
>>>>> means we pass the BUG_ON at the beginning of 
>>>>> virtio_queue_rq, ultimately causing memory corruption and
>>>>> a crash.
>>>>> 
>>>>> If we copy bi_vcnt in __bio_clone_fast, the bios and
>>>>> requests properly report the number of segments and
>>>>> everything works as expected.
>>>>> 
>>>>> Originally reported at 
>>>>> http://bugzilla.opensuse.org/show_bug.cgi?id=888259
>>>> 
>>>> Hi, Jeff,
>>>> 
>>>> Did you manage to reproduce this problem with commit 0738854 
>>>> (blk-merge: fix blk_recount_segments) applied?  Or perhaps
>>>> with commit 200612e (dm table: propagate
>>>> QUEUE_FLAG_NO_SG_MERGE)?
>>> 
>>> Yep. I was able to reproduce it with 3.17. I did try 0738854
>>> when I was still using 3.16 since it looked like a good
>>> candidate. Neither of those patches affect the problem here.
>>> bio->bi_phys_segments never gets a value set in the fast clone
>>> case and that translates to req->nr_phys_segments never getting
>>> properly accumulated. That might not be a problem except that
>>> the NO_SG_MERGE behavior bypasses the iteration that would come
>>> up with the correct value. In either case,
>> 
>> This patch may get incorrect rq->nr_phys_segments because bio
>> cloning is often used in case of I/O splitting, so could you test
>> if the attached patch fixes your problem?

Ah. Right.

> Please ignore last patch and test the attached patch since we still
> should use no sg merge to recalculate req's segments for cloned
> bio.

Yep, this fix works as expected.

Thanks,

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJUNsyuAAoJEB57S2MheeWyCSoQALkgpSfaWXEV89Sg0oaVbI/b
KWaHv1lLKICdkIpU+ujwsZtbxOzgwhG13CeWmLZArVvVp2eljciOdpI4vpuXVtOK
12QFj7ShLtgCt7dw/2zE8NP7tY2nAWznHhd8RXrphwM2XJ2APUVGbiUld2e+IqlI
jgdFrSOOVK+ZTUlDTkxjGPD6Q5zCJ7JDZ7cN7IWQB6yu115UoCJ73j9YhpfcDkRA
TZKk5/1NsCq11Gn+a5zCx9WMfauwNEeuwDFXPol8BMAK6BjucrZJjPDG2ykihuuI
EWUUOdnCmXgQgpW9ARrOr7ldfJO7OYL2u9WIgoBwMl1r23ZmSvIvMKIbUFuSwPSz
8yuJpzoH4SaaEO7e2ud2+aMK88Ih1Rr2uN009WPMYn5WXeCWlCBxaLmrFN4nlu3Y
Z/CoFFlhjsNr9Z7QzCv+u2SyEllA0l3/u+t3SNts4jhRReDGrEQIjs0ASvbKqNYZ
+c3hgrZDwv/xL1f7hBSblwIE0DgBn1rPJBfSRkSfrLzSi103lZrBQxpjN3wtjcFK
037/BfkUuzqSTG9MAdZoVkdw/iNh0Q3qfC7PjcT/EDKn4g30+DE04Sh6gDyojyvu
Z0HtRYUPjnlH0OBzvK8AyXmzLoEjkiyL4rSie1F2cy+qK/uRCUjyon8oCK9nfrga
6KWf1kYhmEAO17mN1y6y
=iDIX
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-09 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/9/14, 9:53 AM, Jeff Moyer wrote:
> Jeff Mahoney  writes:
> 
>> Commit 05f1dd53152173 (block: add queue flag for disabling SG
>> merging) uses bi_vcnt to assign bio->bi_phys_segments if sg
>> merging is disabled. When using device mapper on top of a blk-mq
>> device (virtio_blk in my test), we'd end up overflowing the
>> scatterlist in __blk_bios_map_sg.
>> 
>> __bio_clone_fast copies bi_iter and bi_io_vec but not bi_vcnt,
>> so blk_recount_segments would report bi_phys_segments as 0.
>> Since rq->nr_phys_segments is 0 as well, the checks to ensure
>> that we don't exceed the queue's segment limit end up allowing
>> more bios (and segments) to attach the a request until we finally
>> map it. That also means we pass the BUG_ON at the beginning of
>> virtio_queue_rq, ultimately causing memory corruption and a
>> crash.
>> 
>> If we copy bi_vcnt in __bio_clone_fast, the bios and requests
>> properly report the number of segments and everything works as
>> expected.
>> 
>> Originally reported at
>> http://bugzilla.opensuse.org/show_bug.cgi?id=888259
> 
> Hi, Jeff,
> 
> Did you manage to reproduce this problem with commit 0738854
> (blk-merge: fix blk_recount_segments) applied?  Or perhaps with
> commit 200612e (dm table: propagate QUEUE_FLAG_NO_SG_MERGE)?

Yep. I was able to reproduce it with 3.17. I did try 0738854 when I
was still using 3.16 since it looked like a good candidate. Neither of
those patches affect the problem here. bio->bi_phys_segments never
gets a value set in the fast clone case and that translates to
req->nr_phys_segments never getting properly accumulated. That might
not be a problem except that the NO_SG_MERGE behavior bypasses the
iteration that would come up with the correct value. In either case,
it's still correct to copy bi_vcnt from the source bio since it's
describing the same bvec. Doing the iteration with no merging would
just end up with the same value. bio_clone_bioset builds up bi_vcnt as
it iterates over the number of segments, so that works fine. It's only
__bio_clone_fast that's broken.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJUNpsSAAoJEB57S2MheeWy5k4QAL2N5KQC3cAv+jYHal5qASaZ
5vr2A2VCofnz9ADV/CHhMNzcArcL1MBn/hJvXC8l/RZkEKY2GYaf4Uqu8LoYjPeJ
7rUrJGQ74H2IS63n7+qgBI5AlL+ndSsOruxiCSExIJPVu4t7MeKGnDKLApq9s7zl
JAh+ec+iJ5mT3Nth714zSyDVBFvBOWhZVm5uf/9QeLrFv9aBG4896u6P50SzA5zn
QU7r9sSwws7o30+2f2A9k57vEPm9DiqKQBgQoI0reZmrrs6YzP3PivLp6v2DLlED
WToOy3b5ZLw2GKeAgYFStQpSLvpwBqFdqjydzJ+ynZ+a07YATDuwj1iailSzY/g2
pCZaXazQ4jBJF5Qsy3C1IpN7OAQS1CBzQTEeLrYi0KQ3aM+VQDp9SV6s1vakFEVD
VsYGJxlUG8MvuPFfvUlqGvYtoudy5R0rdTCdIbWAAQlQosser+1u6mhyqHnSPqyW
Pja/hXmTsffASwqQYXd8HqEVZRDtaqX30dfuLYYWWGq9WBcLPpPFrYilwCwgFZ9C
8HGaMHAtEC3vAN8KBj5rVNgMILrxZlMXVqf+BTs+M9dq6Ed7Q+VYytK9WTM7G3Hy
WY/fhWgFoAOyaTHc1EmxbWjxb5wM2anK3zeiJAomAsZ/f/xhwnq/aLgu3x846Af5
V1v8uHhjR+HfHRn0wmaN
=bziR
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-09 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/9/14, 9:53 AM, Jeff Moyer wrote:
 Jeff Mahoney je...@suse.com writes:
 
 Commit 05f1dd53152173 (block: add queue flag for disabling SG
 merging) uses bi_vcnt to assign bio-bi_phys_segments if sg
 merging is disabled. When using device mapper on top of a blk-mq
 device (virtio_blk in my test), we'd end up overflowing the
 scatterlist in __blk_bios_map_sg.
 
 __bio_clone_fast copies bi_iter and bi_io_vec but not bi_vcnt,
 so blk_recount_segments would report bi_phys_segments as 0.
 Since rq-nr_phys_segments is 0 as well, the checks to ensure
 that we don't exceed the queue's segment limit end up allowing
 more bios (and segments) to attach the a request until we finally
 map it. That also means we pass the BUG_ON at the beginning of
 virtio_queue_rq, ultimately causing memory corruption and a
 crash.
 
 If we copy bi_vcnt in __bio_clone_fast, the bios and requests
 properly report the number of segments and everything works as
 expected.
 
 Originally reported at
 http://bugzilla.opensuse.org/show_bug.cgi?id=888259
 
 Hi, Jeff,
 
 Did you manage to reproduce this problem with commit 0738854
 (blk-merge: fix blk_recount_segments) applied?  Or perhaps with
 commit 200612e (dm table: propagate QUEUE_FLAG_NO_SG_MERGE)?

Yep. I was able to reproduce it with 3.17. I did try 0738854 when I
was still using 3.16 since it looked like a good candidate. Neither of
those patches affect the problem here. bio-bi_phys_segments never
gets a value set in the fast clone case and that translates to
req-nr_phys_segments never getting properly accumulated. That might
not be a problem except that the NO_SG_MERGE behavior bypasses the
iteration that would come up with the correct value. In either case,
it's still correct to copy bi_vcnt from the source bio since it's
describing the same bvec. Doing the iteration with no merging would
just end up with the same value. bio_clone_bioset builds up bi_vcnt as
it iterates over the number of segments, so that works fine. It's only
__bio_clone_fast that's broken.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJUNpsSAAoJEB57S2MheeWy5k4QAL2N5KQC3cAv+jYHal5qASaZ
5vr2A2VCofnz9ADV/CHhMNzcArcL1MBn/hJvXC8l/RZkEKY2GYaf4Uqu8LoYjPeJ
7rUrJGQ74H2IS63n7+qgBI5AlL+ndSsOruxiCSExIJPVu4t7MeKGnDKLApq9s7zl
JAh+ec+iJ5mT3Nth714zSyDVBFvBOWhZVm5uf/9QeLrFv9aBG4896u6P50SzA5zn
QU7r9sSwws7o30+2f2A9k57vEPm9DiqKQBgQoI0reZmrrs6YzP3PivLp6v2DLlED
WToOy3b5ZLw2GKeAgYFStQpSLvpwBqFdqjydzJ+ynZ+a07YATDuwj1iailSzY/g2
pCZaXazQ4jBJF5Qsy3C1IpN7OAQS1CBzQTEeLrYi0KQ3aM+VQDp9SV6s1vakFEVD
VsYGJxlUG8MvuPFfvUlqGvYtoudy5R0rdTCdIbWAAQlQosser+1u6mhyqHnSPqyW
Pja/hXmTsffASwqQYXd8HqEVZRDtaqX30dfuLYYWWGq9WBcLPpPFrYilwCwgFZ9C
8HGaMHAtEC3vAN8KBj5rVNgMILrxZlMXVqf+BTs+M9dq6Ed7Q+VYytK9WTM7G3Hy
WY/fhWgFoAOyaTHc1EmxbWjxb5wM2anK3zeiJAomAsZ/f/xhwnq/aLgu3x846Af5
V1v8uHhjR+HfHRn0wmaN
=bziR
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-09 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/9/14, 12:13 PM, Ming Lei wrote:
 On Thu, Oct 9, 2014 at 11:25 PM, Ming Lei ming@canonical.com
 wrote:
 On Thu, Oct 9, 2014 at 10:26 PM, Jeff Mahoney je...@suse.com
 wrote:
 -BEGIN PGP SIGNED MESSAGE- Hash: SHA1
 
 On 10/9/14, 9:53 AM, Jeff Moyer wrote:
 Jeff Mahoney je...@suse.com writes:
 
 Commit 05f1dd53152173 (block: add queue flag for disabling
 SG merging) uses bi_vcnt to assign bio-bi_phys_segments if
 sg merging is disabled. When using device mapper on top of
 a blk-mq device (virtio_blk in my test), we'd end up
 overflowing the scatterlist in __blk_bios_map_sg.
 
 __bio_clone_fast copies bi_iter and bi_io_vec but not
 bi_vcnt, so blk_recount_segments would report
 bi_phys_segments as 0. Since rq-nr_phys_segments is 0 as
 well, the checks to ensure that we don't exceed the queue's
 segment limit end up allowing more bios (and segments) to
 attach the a request until we finally map it. That also
 means we pass the BUG_ON at the beginning of 
 virtio_queue_rq, ultimately causing memory corruption and
 a crash.
 
 If we copy bi_vcnt in __bio_clone_fast, the bios and
 requests properly report the number of segments and
 everything works as expected.
 
 Originally reported at 
 http://bugzilla.opensuse.org/show_bug.cgi?id=888259
 
 Hi, Jeff,
 
 Did you manage to reproduce this problem with commit 0738854 
 (blk-merge: fix blk_recount_segments) applied?  Or perhaps
 with commit 200612e (dm table: propagate
 QUEUE_FLAG_NO_SG_MERGE)?
 
 Yep. I was able to reproduce it with 3.17. I did try 0738854
 when I was still using 3.16 since it looked like a good
 candidate. Neither of those patches affect the problem here.
 bio-bi_phys_segments never gets a value set in the fast clone
 case and that translates to req-nr_phys_segments never getting
 properly accumulated. That might not be a problem except that
 the NO_SG_MERGE behavior bypasses the iteration that would come
 up with the correct value. In either case,
 
 This patch may get incorrect rq-nr_phys_segments because bio
 cloning is often used in case of I/O splitting, so could you test
 if the attached patch fixes your problem?

Ah. Right.

 Please ignore last patch and test the attached patch since we still
 should use no sg merge to recalculate req's segments for cloned
 bio.

Yep, this fix works as expected.

Thanks,

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJUNsyuAAoJEB57S2MheeWyCSoQALkgpSfaWXEV89Sg0oaVbI/b
KWaHv1lLKICdkIpU+ujwsZtbxOzgwhG13CeWmLZArVvVp2eljciOdpI4vpuXVtOK
12QFj7ShLtgCt7dw/2zE8NP7tY2nAWznHhd8RXrphwM2XJ2APUVGbiUld2e+IqlI
jgdFrSOOVK+ZTUlDTkxjGPD6Q5zCJ7JDZ7cN7IWQB6yu115UoCJ73j9YhpfcDkRA
TZKk5/1NsCq11Gn+a5zCx9WMfauwNEeuwDFXPol8BMAK6BjucrZJjPDG2ykihuuI
EWUUOdnCmXgQgpW9ARrOr7ldfJO7OYL2u9WIgoBwMl1r23ZmSvIvMKIbUFuSwPSz
8yuJpzoH4SaaEO7e2ud2+aMK88Ih1Rr2uN009WPMYn5WXeCWlCBxaLmrFN4nlu3Y
Z/CoFFlhjsNr9Z7QzCv+u2SyEllA0l3/u+t3SNts4jhRReDGrEQIjs0ASvbKqNYZ
+c3hgrZDwv/xL1f7hBSblwIE0DgBn1rPJBfSRkSfrLzSi103lZrBQxpjN3wtjcFK
037/BfkUuzqSTG9MAdZoVkdw/iNh0Q3qfC7PjcT/EDKn4g30+DE04Sh6gDyojyvu
Z0HtRYUPjnlH0OBzvK8AyXmzLoEjkiyL4rSie1F2cy+qK/uRCUjyon8oCK9nfrga
6KWf1kYhmEAO17mN1y6y
=iDIX
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-08 Thread Jeff Mahoney

Commit 05f1dd53152173 (block: add queue flag for disabling SG merging) uses
bi_vcnt to assign bio->bi_phys_segments if sg merging is disabled. When
using device mapper on top of a blk-mq device (virtio_blk in my test),
we'd end up overflowing the scatterlist in __blk_bios_map_sg.

__bio_clone_fast copies bi_iter and bi_io_vec but not bi_vcnt, so
blk_recount_segments would report bi_phys_segments as 0. Since
rq->nr_phys_segments is 0 as well, the checks to ensure that we don't
exceed the queue's segment limit end up allowing more bios (and segments) to
attach the a request until we finally map it. That also means we
pass the BUG_ON at the beginning of virtio_queue_rq, ultimately causing
memory corruption and a crash.

If we copy bi_vcnt in __bio_clone_fast, the bios and requests properly
report the number of segments and everything works as expected.

Originally reported at http://bugzilla.opensuse.org/show_bug.cgi?id=888259

Reported-by: Stephen Kulow 
Signed-off-by: Jeff Mahoney 
---

 block/bio.c |1 +
 1 file changed, 1 insertion(+)

--- a/block/bio.c
+++ b/block/bio.c
@@ -564,6 +564,7 @@ void __bio_clone_fast(struct bio *bio, s
bio->bi_rw = bio_src->bi_rw;
bio->bi_iter = bio_src->bi_iter;
bio->bi_io_vec = bio_src->bi_io_vec;
+   bio->bi_vcnt = bio_src->bi_vcnt;
 }
 EXPORT_SYMBOL(__bio_clone_fast);
 

-- 
Jeff Mahoney
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] block: copy bi_vcnt in __bio_clone_fast

2014-10-08 Thread Jeff Mahoney

Commit 05f1dd53152173 (block: add queue flag for disabling SG merging) uses
bi_vcnt to assign bio-bi_phys_segments if sg merging is disabled. When
using device mapper on top of a blk-mq device (virtio_blk in my test),
we'd end up overflowing the scatterlist in __blk_bios_map_sg.

__bio_clone_fast copies bi_iter and bi_io_vec but not bi_vcnt, so
blk_recount_segments would report bi_phys_segments as 0. Since
rq-nr_phys_segments is 0 as well, the checks to ensure that we don't
exceed the queue's segment limit end up allowing more bios (and segments) to
attach the a request until we finally map it. That also means we
pass the BUG_ON at the beginning of virtio_queue_rq, ultimately causing
memory corruption and a crash.

If we copy bi_vcnt in __bio_clone_fast, the bios and requests properly
report the number of segments and everything works as expected.

Originally reported at http://bugzilla.opensuse.org/show_bug.cgi?id=888259

Reported-by: Stephen Kulow co...@suse.com
Signed-off-by: Jeff Mahoney je...@suse.com
---

 block/bio.c |1 +
 1 file changed, 1 insertion(+)

--- a/block/bio.c
+++ b/block/bio.c
@@ -564,6 +564,7 @@ void __bio_clone_fast(struct bio *bio, s
bio-bi_rw = bio_src-bi_rw;
bio-bi_iter = bio_src-bi_iter;
bio-bi_io_vec = bio_src-bi_io_vec;
+   bio-bi_vcnt = bio_src-bi_vcnt;
 }
 EXPORT_SYMBOL(__bio_clone_fast);
 

-- 
Jeff Mahoney
SUSE Labs

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: linux-3.16.2 queue (3.16.1+)

2014-09-10 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 9/6/14, 11:18 PM, Greg KH wrote:
> On Sun, Sep 07, 2014 at 02:47:55AM +0200, Matt wrote:
>> On Thu, Aug 28, 2014 at 9:18 PM, Matt 
>> wrote:
>>> On Thu, Aug 28, 2014 at 5:32 PM, Greg KH
>>>  wrote:
>>>> On Thu, Aug 28, 2014 at 05:27:27PM +0200, Matt wrote:
>>>>> On Thu, Aug 28, 2014 at 5:22 PM, Greg KH
>>>>>  wrote:
>>>>>> On Thu, Aug 28, 2014 at 05:16:58PM +0200, Matt wrote:
>>>>>>> Hi Greg,
>>>>>>> 
>>>>>>> 
>>>>>>> please consider adding the following 2 patches to
>>>>>>> 3.16.2:
>>>>>>> 
>>>>>>> Jan Kara (1): reiserfs: Fix use after free in journal
>>>>>>> teardown
>>>>>>> 
>>>>>>> Jeff Mahoney (1): reiserfs: fix corruption introduced
>>>>>>> by balance_leaf refactor
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Reason/Related:
>>>>>>> 
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=83121
>>>>>>> 
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=83321
>>>>>>> 
>>>>>>> http://forums.gentoo.org/viewtopic-t-998538-postdays-0-postorder-asc-start-0.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 
Many thanks in advance
>>>>>> 
>>>>>> I need git commit ids of these patches in Linus's tree,
>>>>>> can you provide those please?
>>>>>> 
>>>>>> thanks,
>>>>>> 
>>>>>> greg k-h
>>>>> 
>>>>> 
>>>>> Sure:
>>>>> 
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d
>>>>>
>>>>> 
reiserfs: fix corruption introduced by balance_leaf refactor
>>>>> 
>>>>> 
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01777836c87081e4f68c4a43c9abe6114805f91e
>>>>>
>>>>> 
reiserfs: Fix use after free in journal teardown
>>>>> 
>>>>> 
>>>>> 
>>>>> are checkpatch warnings usually also fixed within stable
>>>>> releases ?
>>>> 
>>>> No, not at all, please read
>>>> Documentation/stable_kernel_patches.txt for what is
>>>> acceptable for stable kernel patches.
>>>> 
>>>> thanks,
>>>> 
>>>> greg k-h
>>> 
>>> 
>>> okay, will do
>>> 
>>> thanks for pointing that out
>>> 
>>> 
>>> Regards
>>> 
>>> Matt
>> 
>> Hi Greg,
>> 
>> could you please add the above mentioned two patches
>> 
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d
>>
>> 
reiserfs: fix corruption introduced by balance_leaf refactor
>> 
>> 
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01777836c87081e4f68c4a43c9abe6114805f91e
>>
>> 
reiserfs: Fix use after free in journal teardown
>> 
>> in next stable (3.16.3) kernel ?
>> 
>> more and more people seem to be affected by the data corruption 
>> introduced by the recent changes.
>> 
>> 
>> Reading through Documentation/stable_kernel_rules.txt, 
>> http://cwe.mitre.org/data/definitions/416.html and 
>> http://www.hpenterprisesecurity.com/vulncat/en/vulncat/cpp/use_after_free.html
>>
>>
>> 
both patches seem relevant enough (concerning data integrity
>> filesystem-wise and security) to be included for the stable
>> branch
> 
> I'll queue this up when I get a chance, there are over 300 patches 
> pending for the stable kernels right now :(
> 
> Also, in the future, always cc sta...@vger.kernel.org for any
> stable requests so that they don't get lost.

Hi Greg -

27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d
Author: Jeff Mahoney 
Date:   Mon Aug 4 19:51:47 2014 -0400

reiserfs: fix corruption introduced by balance_leaf refactor

Commits f1f007c308e (reiserfs: balance_leaf refactor, pull out
balance_leaf_insert_left) and cf22df182bf (reiserfs: balance_leaf
refactor, pull out balance_leaf_paste_left) missed that the

Re: linux-3.16.2 queue (3.16.1+)

2014-09-10 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 9/6/14, 11:18 PM, Greg KH wrote:
 On Sun, Sep 07, 2014 at 02:47:55AM +0200, Matt wrote:
 On Thu, Aug 28, 2014 at 9:18 PM, Matt jackdac...@gmail.com
 wrote:
 On Thu, Aug 28, 2014 at 5:32 PM, Greg KH
 gre...@linuxfoundation.org wrote:
 On Thu, Aug 28, 2014 at 05:27:27PM +0200, Matt wrote:
 On Thu, Aug 28, 2014 at 5:22 PM, Greg KH
 gre...@linuxfoundation.org wrote:
 On Thu, Aug 28, 2014 at 05:16:58PM +0200, Matt wrote:
 Hi Greg,
 
 
 please consider adding the following 2 patches to
 3.16.2:
 
 Jan Kara (1): reiserfs: Fix use after free in journal
 teardown
 
 Jeff Mahoney (1): reiserfs: fix corruption introduced
 by balance_leaf refactor
 
 
 
 Reason/Related:
 
 https://bugzilla.kernel.org/show_bug.cgi?id=83121
 
 https://bugzilla.kernel.org/show_bug.cgi?id=83321
 
 http://forums.gentoo.org/viewtopic-t-998538-postdays-0-postorder-asc-start-0.html



 
Many thanks in advance
 
 I need git commit ids of these patches in Linus's tree,
 can you provide those please?
 
 thanks,
 
 greg k-h
 
 
 Sure:
 
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d

 
reiserfs: fix corruption introduced by balance_leaf refactor
 
 
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01777836c87081e4f68c4a43c9abe6114805f91e

 
reiserfs: Fix use after free in journal teardown
 
 
 
 are checkpatch warnings usually also fixed within stable
 releases ?
 
 No, not at all, please read
 Documentation/stable_kernel_patches.txt for what is
 acceptable for stable kernel patches.
 
 thanks,
 
 greg k-h
 
 
 okay, will do
 
 thanks for pointing that out
 
 
 Regards
 
 Matt
 
 Hi Greg,
 
 could you please add the above mentioned two patches
 
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d

 
reiserfs: fix corruption introduced by balance_leaf refactor
 
 
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01777836c87081e4f68c4a43c9abe6114805f91e

 
reiserfs: Fix use after free in journal teardown
 
 in next stable (3.16.3) kernel ?
 
 more and more people seem to be affected by the data corruption 
 introduced by the recent changes.
 
 
 Reading through Documentation/stable_kernel_rules.txt, 
 http://cwe.mitre.org/data/definitions/416.html and 
 http://www.hpenterprisesecurity.com/vulncat/en/vulncat/cpp/use_after_free.html


 
both patches seem relevant enough (concerning data integrity
 filesystem-wise and security) to be included for the stable
 branch
 
 I'll queue this up when I get a chance, there are over 300 patches 
 pending for the stable kernels right now :(
 
 Also, in the future, always cc sta...@vger.kernel.org for any
 stable requests so that they don't get lost.

Hi Greg -

27d0e5bc85f3341b9ba66f0c23627cf9d7538c9d
Author: Jeff Mahoney je...@suse.com
Date:   Mon Aug 4 19:51:47 2014 -0400

reiserfs: fix corruption introduced by balance_leaf refactor

Commits f1f007c308e (reiserfs: balance_leaf refactor, pull out
balance_leaf_insert_left) and cf22df182bf (reiserfs: balance_leaf
refactor, pull out balance_leaf_paste_left) missed that the `body'
pointer was getting repositioned. Subsequent users of the pointer
would expect it to be repositioned, and as a result, parts of the
tree would get overwritten. The most common observed corruption
is indirect block pointers being overwritten.

Since the body value isn't actually used anymore in the called
routines,
we can pass back the offset it should be shifted. We constify the body
and ih pointers in the balance_leaf as a mostly-free preventative
measure.

Cc: sta...@vger.kernel.org # 3.16
Reported-and-tested-by: Jeff Chua jeff.chua.li...@gmail.com
Signed-off-by: Jeff Mahoney je...@suse.com
Signed-off-by: Jan Kara j...@suse.cz

Should there have been more? I thought it was enough to add the Cc
tag. This one has been in the tree, with the tags and with
corruption in the Subject since 13 Aug. I know you're busy but this
seems like a pretty obvious candidate for stable inclusion.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJUESUpAAoJEB57S2MheeWyt9oQAIEnvZPojErvzzv4IcvVheSI
Ju1XChkU4YDRW3W2e8PEjAhiPd1dMP7aEJvfq6AxlKYAYENaS/S2LdSbBbeVctFa
1VwBVakDkmHduVcb2hl3ldIQlHRW0w/q/fSk+NKZavANS/maIK/mj2HE8S3Op17C
iGsGZiluqaYp56yPHJK7XDorpWFoCVXIPlHUbec8lIxnyPqeytHo2W5UtfZZVeN3
BfGICzR57i7YjOtQ/lsmusiUjp7Ym4REKX1GGnIcZ1Po5F8oX4phMVaUR0gR1NSA
eYBcTyH245iWTQBFqE9D5AR0pHLnmi6EySEbNIWU3w0OYffDCBpqU7A7Dm5O2kng
caIlNuf4TMEp7QzVC8hxCL61nxBWJ6L2RQ9NkOg9zLHXdaWhJSjHl7TdRUPV/C3V
RzNCZEWvqEpMoju145Wez7JlcE/GlsBclNFGBqypEWN364B/MprKe5vhpeXJ+1H2
yUq/qKlgQLZe5uPCwMdcyAB3xTX8mIzG4nz8RWez6WPjhAlb82xtBSl0btWjSnVM
4YlWy/5jCWgyjXzrM3hd8P3SJi+l69rVUE+UcMvOqHq3oCFBddhlUh9tHM7pn9tH
sXTo8f8s9Pe7+HvbA0bwtwbwTQ8tNxn87ovuVnAO86RLmeeM7HCfqBU/4lKEZ0WE

Re: [PATCH] Kconfig: do not select SPI bus on sub-driver auto-select

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 19:34:12 2014, Randy Dunlap wrote:
> On 08/22/14 10:04, Jeff Mahoney wrote:
>> On Fri Aug 22 13:02:09 2014, Antti Palosaari wrote:
>>> We should not select SPI bus when sub-driver auto-select is
>>> selected. That option is meant for auto-selecting all possible
>>> ancillary drivers used for selected board driver. Ancillary
>>> drivers should define needed dependencies itself.
>>>
>>> I2C and I2C_MUX are still selected here for a reason described on
>>> commit 347f7a3763601d7b466898d1f10080b7083ac4a3
>>>
>>> Reverts commit e4462ffc1602d9df21c00a0381dca9080474e27a
>>>
>>> Reported-by: Jeff Mahoney 
>>> Signed-off-by: Antti Palosaari 
>>> ---
>>>  drivers/media/Kconfig | 1 -
>>>  1 file changed, 1 deletion(-)
>>>
>>> diff --git a/drivers/media/Kconfig b/drivers/media/Kconfig
>>> index f60bad4..3c89fcb 100644
>>> --- a/drivers/media/Kconfig
>>> +++ b/drivers/media/Kconfig
>>> @@ -182,7 +182,6 @@ config MEDIA_SUBDRV_AUTOSELECT
>>> depends on HAS_IOMEM
>>> select I2C
>>> select I2C_MUX
>>> -   select SPI
>>> default y
>>> help
>>>   By default, a media driver auto-selects all possible ancillary
>>
>> FWIW, in the patch I used locally, I also did a 'select SPI' in the
>> MSI2500 driver since it wouldn't otherwise be obvious that a USB device
>> depends on SPI.
>
> It already has depends on SPI.  That should be enough.
>

Yeah. My point was more that if you want support for that device, you'd 
have to know it uses SPI internally already.

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Kconfig: do not select SPI bus on sub-driver auto-select

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 13:02:09 2014, Antti Palosaari wrote:
> We should not select SPI bus when sub-driver auto-select is
> selected. That option is meant for auto-selecting all possible
> ancillary drivers used for selected board driver. Ancillary
> drivers should define needed dependencies itself.
>
> I2C and I2C_MUX are still selected here for a reason described on
> commit 347f7a3763601d7b466898d1f10080b7083ac4a3
>
> Reverts commit e4462ffc1602d9df21c00a0381dca9080474e27a
>
> Reported-by: Jeff Mahoney 
> Signed-off-by: Antti Palosaari 
> ---
>  drivers/media/Kconfig | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/media/Kconfig b/drivers/media/Kconfig
> index f60bad4..3c89fcb 100644
> --- a/drivers/media/Kconfig
> +++ b/drivers/media/Kconfig
> @@ -182,7 +182,6 @@ config MEDIA_SUBDRV_AUTOSELECT
>   depends on HAS_IOMEM
>   select I2C
>   select I2C_MUX
> - select SPI
>   default y
>   help
> By default, a media driver auto-selects all possible ancillary

FWIW, in the patch I used locally, I also did a 'select SPI' in the 
MSI2500 driver since it wouldn't otherwise be obvious that a USB device 
depends on SPI.

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Autoselecting SPI for MEDIA_SUBDRV_AUTOSELECT?

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 11:17:22 2014, Antti Palosaari wrote:
> Moikka!
>
> On 08/22/2014 06:00 PM, Jeff Mahoney wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA1
>>
>> Hi Antti -
>>
>> Commit e4462ffc160 ([media] Kconfig: sub-driver auto-select SPI bus)
>> enables CONFIG_SPI globally for a driver that won't even be enabled in
>> many cases.
>>
>> Is there a reason USB_MSI2500 doesn't select SPI instead of
>> MEDIA_SUBDRV_AUTOSELECT?
>
> Nothing but I decided to set it similarly as I2C, another more common
> bus. IIRC same was for I2C_MUX too.
>
> You could still disable media subdriver autoselect and then disable
> SPI and select all the media drivers (excluding MSSi2500) manually.
>
> I have feeling that media auto-select was added to select everything
> needed for media.

Ok, that makes sense. I suppose I'll still need to enable SPI just for 
this device and disable every other SPI device anyway. I'll live.

Thanks,

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Autoselecting SPI for MEDIA_SUBDRV_AUTOSELECT?

2014-08-22 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Antti -

Commit e4462ffc160 ([media] Kconfig: sub-driver auto-select SPI bus)
enables CONFIG_SPI globally for a driver that won't even be enabled in
many cases.

Is there a reason USB_MSI2500 doesn't select SPI instead of
MEDIA_SUBDRV_AUTOSELECT?

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJT91smAAoJEB57S2MheeWytcQP/jVJNWUWQ414XaltxVyAF8XT
kyCCEXl/MslN8W36p6oMC6TX7KaXjQQSFVUltyt/UgcezuqhkU8nBUETEgLVou1n
1uF120zBu3gy3Hr+dG/3Awsdqb1UbrUrNM2LRL6sU5GIqYSdTxErL8inzXjj6/ow
MttC0yBf2aEumihLEfAqcmEPM0ryS9aZECEyclPA+KrYO9qJaicE2lg8JcSrg8yR
TyNYvRrQ2NOuh5thE4qDH/+jfwtZ4yuEXavIgLsZVuDL9+s11RsfOCxY6eLDPPer
qCI3r/2VYm4xMoF06FhOjafKvhhR97H4dm7jecW1zsGfwamrXMmB0+76IJkzxKpA
le7R5NHk9wag2d9zOLmWC2RQbAVspxV5idD/oq+6bSNLl2un89Pz2RMdZGln68db
z+Oz8W7OKhdQm7PyV2RhHNi67rcaGYt5m3fFkjp2RsylqZdQ7oO/zcMOiZxCxdqT
hGmsNXrgRDLoifA71BGrzyC2xmXYDzZ4K89W3msfW/h6TCNu1IWwqxs1qPucaHXk
xBXu5hn5+1sSmgDJAkg5ODum3hCt6PRdP54/D6hy4Ul9742HtBg5MzPEojSlqdLO
yF0ePAs7Ga8/xOGatCsz1aK8upFAOq6fvrakE2ibx5/0+Yyta9efR0X8q3oKskCr
Tc0Azhsxyhe1F/AObQW+
=7Eoh
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Kconfig: do not select SPI bus on sub-driver auto-select

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 19:34:12 2014, Randy Dunlap wrote:
 On 08/22/14 10:04, Jeff Mahoney wrote:
 On Fri Aug 22 13:02:09 2014, Antti Palosaari wrote:
 We should not select SPI bus when sub-driver auto-select is
 selected. That option is meant for auto-selecting all possible
 ancillary drivers used for selected board driver. Ancillary
 drivers should define needed dependencies itself.

 I2C and I2C_MUX are still selected here for a reason described on
 commit 347f7a3763601d7b466898d1f10080b7083ac4a3

 Reverts commit e4462ffc1602d9df21c00a0381dca9080474e27a

 Reported-by: Jeff Mahoney je...@suse.com
 Signed-off-by: Antti Palosaari cr...@iki.fi
 ---
  drivers/media/Kconfig | 1 -
  1 file changed, 1 deletion(-)

 diff --git a/drivers/media/Kconfig b/drivers/media/Kconfig
 index f60bad4..3c89fcb 100644
 --- a/drivers/media/Kconfig
 +++ b/drivers/media/Kconfig
 @@ -182,7 +182,6 @@ config MEDIA_SUBDRV_AUTOSELECT
 depends on HAS_IOMEM
 select I2C
 select I2C_MUX
 -   select SPI
 default y
 help
   By default, a media driver auto-selects all possible ancillary

 FWIW, in the patch I used locally, I also did a 'select SPI' in the
 MSI2500 driver since it wouldn't otherwise be obvious that a USB device
 depends on SPI.

 It already has depends on SPI.  That should be enough.


Yeah. My point was more that if you want support for that device, you'd 
have to know it uses SPI internally already.

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Autoselecting SPI for MEDIA_SUBDRV_AUTOSELECT?

2014-08-22 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Antti -

Commit e4462ffc160 ([media] Kconfig: sub-driver auto-select SPI bus)
enables CONFIG_SPI globally for a driver that won't even be enabled in
many cases.

Is there a reason USB_MSI2500 doesn't select SPI instead of
MEDIA_SUBDRV_AUTOSELECT?

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJT91smAAoJEB57S2MheeWytcQP/jVJNWUWQ414XaltxVyAF8XT
kyCCEXl/MslN8W36p6oMC6TX7KaXjQQSFVUltyt/UgcezuqhkU8nBUETEgLVou1n
1uF120zBu3gy3Hr+dG/3Awsdqb1UbrUrNM2LRL6sU5GIqYSdTxErL8inzXjj6/ow
MttC0yBf2aEumihLEfAqcmEPM0ryS9aZECEyclPA+KrYO9qJaicE2lg8JcSrg8yR
TyNYvRrQ2NOuh5thE4qDH/+jfwtZ4yuEXavIgLsZVuDL9+s11RsfOCxY6eLDPPer
qCI3r/2VYm4xMoF06FhOjafKvhhR97H4dm7jecW1zsGfwamrXMmB0+76IJkzxKpA
le7R5NHk9wag2d9zOLmWC2RQbAVspxV5idD/oq+6bSNLl2un89Pz2RMdZGln68db
z+Oz8W7OKhdQm7PyV2RhHNi67rcaGYt5m3fFkjp2RsylqZdQ7oO/zcMOiZxCxdqT
hGmsNXrgRDLoifA71BGrzyC2xmXYDzZ4K89W3msfW/h6TCNu1IWwqxs1qPucaHXk
xBXu5hn5+1sSmgDJAkg5ODum3hCt6PRdP54/D6hy4Ul9742HtBg5MzPEojSlqdLO
yF0ePAs7Ga8/xOGatCsz1aK8upFAOq6fvrakE2ibx5/0+Yyta9efR0X8q3oKskCr
Tc0Azhsxyhe1F/AObQW+
=7Eoh
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Autoselecting SPI for MEDIA_SUBDRV_AUTOSELECT?

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 11:17:22 2014, Antti Palosaari wrote:
 Moikka!

 On 08/22/2014 06:00 PM, Jeff Mahoney wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Hi Antti -

 Commit e4462ffc160 ([media] Kconfig: sub-driver auto-select SPI bus)
 enables CONFIG_SPI globally for a driver that won't even be enabled in
 many cases.

 Is there a reason USB_MSI2500 doesn't select SPI instead of
 MEDIA_SUBDRV_AUTOSELECT?

 Nothing but I decided to set it similarly as I2C, another more common
 bus. IIRC same was for I2C_MUX too.

 You could still disable media subdriver autoselect and then disable
 SPI and select all the media drivers (excluding MSSi2500) manually.

 I have feeling that media auto-select was added to select everything
 needed for media.

Ok, that makes sense. I suppose I'll still need to enable SPI just for 
this device and disable every other SPI device anyway. I'll live.

Thanks,

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] Kconfig: do not select SPI bus on sub-driver auto-select

2014-08-22 Thread Jeff Mahoney

On Fri Aug 22 13:02:09 2014, Antti Palosaari wrote:
 We should not select SPI bus when sub-driver auto-select is
 selected. That option is meant for auto-selecting all possible
 ancillary drivers used for selected board driver. Ancillary
 drivers should define needed dependencies itself.

 I2C and I2C_MUX are still selected here for a reason described on
 commit 347f7a3763601d7b466898d1f10080b7083ac4a3

 Reverts commit e4462ffc1602d9df21c00a0381dca9080474e27a

 Reported-by: Jeff Mahoney je...@suse.com
 Signed-off-by: Antti Palosaari cr...@iki.fi
 ---
  drivers/media/Kconfig | 1 -
  1 file changed, 1 deletion(-)

 diff --git a/drivers/media/Kconfig b/drivers/media/Kconfig
 index f60bad4..3c89fcb 100644
 --- a/drivers/media/Kconfig
 +++ b/drivers/media/Kconfig
 @@ -182,7 +182,6 @@ config MEDIA_SUBDRV_AUTOSELECT
   depends on HAS_IOMEM
   select I2C
   select I2C_MUX
 - select SPI
   default y
   help
 By default, a media driver auto-selects all possible ancillary

FWIW, in the patch I used locally, I also did a 'select SPI' in the 
MSI2500 driver since it wouldn't otherwise be obvious that a USB device 
depends on SPI.

-Jeff

--
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] rtsx_usb: export device table

2014-08-08 Thread Jeff Mahoney

The rtsx_usb driver contains the table for the devices it supports but
doesn't export it. As a result, no alias is generated and it doesn't
get loaded automatically.

Via https://bugzilla.novell.com/show_bug.cgi?id=890096

Reported-by: Marcel Witte 
Signed-off-by: Jeff Mahoney 
---
 drivers/mfd/rtsx_usb.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/mfd/rtsx_usb.c
+++ b/drivers/mfd/rtsx_usb.c
@@ -744,6 +744,7 @@ static struct usb_device_id rtsx_usb_usb
{ USB_DEVICE(0x0BDA, 0x0140) },
{ }
 };
+MODULE_DEVICE_TABLE(usb, rtsx_usb_usb_ids);
 
 static struct usb_driver rtsx_usb_driver = {
.name   = "rtsx_usb",


-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] rtsx_usb: export device table

2014-08-08 Thread Jeff Mahoney

The rtsx_usb driver contains the table for the devices it supports but
doesn't export it. As a result, no alias is generated and it doesn't
get loaded automatically.

Via https://bugzilla.novell.com/show_bug.cgi?id=890096

Reported-by: Marcel Witte witte...@googlemail.com
Signed-off-by: Jeff Mahoney je...@suse.com
---
 drivers/mfd/rtsx_usb.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/mfd/rtsx_usb.c
+++ b/drivers/mfd/rtsx_usb.c
@@ -744,6 +744,7 @@ static struct usb_device_id rtsx_usb_usb
{ USB_DEVICE(0x0BDA, 0x0140) },
{ }
 };
+MODULE_DEVICE_TABLE(usb, rtsx_usb_usb_ids);
 
 static struct usb_driver rtsx_usb_driver = {
.name   = rtsx_usb,


-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PULL] Reiserfs & ext3 changes for 3.16-rc1

2014-06-09 Thread Jeff Mahoney

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 6/7/14, 7:52 PM, Linus Torvalds wrote:
> On Wed, Jun 4, 2014 at 1:27 PM, Jan Kara  wrote:
>> Hello Linus,
>> 
>> could you please pull from
>> 
>> git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git 
>> for_linus
>> 
>> to get big reiserfs cleanup from Jeff, an ext3 deadlock fix, and 
>> some small cleanups.
> 
> This does not work at all for me:
> 
> fs/reiserfs/do_balan.c: In function ‘balance_leaf_new_nodes’: 
> fs/reiserfs/do_balan.c:1248:3: error: implicit declaration of 
> function ‘format_bh’ [-Werror=implicit-function-declaration] 
> RFALSE(!buffer_journaled(tb->S_new[i]) ^
> 
> Hmm? "format_bh()"?

Weird. That's part of a follow-on patchset. One of the patches gets
rid of do_reiserfs_warning printing into a static buffer w/o a lock. I
must've forgotten a 'quilt add' when adding that chunk to the newer
patch. I'll review what else, if anything, might've changed but for
now, that should be be s/format_bh(tb->S_new[i])/tb->S_new[i]/.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)

iQIcBAEBAgAGBQJTlhSqAAoJEB57S2MheeWyy+IP/iYBJbocG4hF8364ognOfyX+
Q1wEzuNvRbZkplNueI1spfvdSJdJ2xMkWYXhaX2WwjQtG+pibLTUbFe8fgEh4egL
0yxtkENeJye0RsCJVALH25iu3iL0u6e03phM6FyvgkRra1jOWSMXJTAlV8ZpQs3o
EXBQ5qoojOaWc3yGmYEugXkLOp5JyJZn0S9EbWqBKw9v8e01kz1zxzfUAoiYb6Ug
hhoXiP2MOhTsgGgMJcCuk2cPNZ2CmHBlv++hGmyA/Dbne1oi+TacaHilwXI2xNI+
fU0IgSVdVC4avZDXXAXLbz0oBLz4/ba/OIiFN8htipNU9wUBJ+GFawpEZIqZ8iTz
WEptpUhcjZdc2SXJxuajjf2Am4VmTt3Rv/wg8kvopD0ANT9sJxLPuKENFqqiusVe
ugE7tL4JkHbz4vm4cbNu3YP7LfXtq/s3EpwiW5SJMVcKURaOriyYrOYbdJ73yKwb
P1K7sj+z3zKmPVuRrbDOgyVKMP1xxb8/ockJVawRUx4vZcn08Nu5f7ZofPewW/mT
XVHDe+YFC+IGLqJwudE0j6O6ZAWZTt1uJeg6IoFenuJhZACZiT/dhR2r5z/kCclk
0VFpLhKR+xy8Q7BB3gDWYZ/NFS7RA8gEu0pbIOj36LUyERX4UeUOsH+oWYoxLvwj
sazZbiauomfvBVZxu4+f
=4Lqx
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 >

1 - 100 of 301 matches

Mail list logo