Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-28 Thread Rafael J. Wysocki
On Thursday, January 28, 2016 07:45:53 AM Viresh Kumar wrote:
> On 27-01-16, 23:54, Rafael J. Wysocki wrote:
> > So I've applied this, but I'm not sure it is sufficient yet.
> 
> At least, this solves the crash Juri was hitting on a multi cluster
> box.

Yes, it makes the crash go away in his setup.

> > Have you double checked whether or not stuff cannot be reordered by
> > the CPU and/or the compiler and no additional memory barriers are needed?
> 
> I don't think CPU will reorder things before a function call.

It can do that in theory.

First of all, functions may be inlined by the compiler.

Second, even if they aren't, the call instruction only means "take the next
instruction from that other location in memory" to the CPU, and the instructions
following the call go into the pipeline along with the ones preceding it, so
they may be reordered in the process.

> It can reorder lines,

Not lines, but instructions.

> which CPU thinks aren't related but it can't assume the
> same in this case. We have tons of code like this.

Code that relies on specific ordering of instructions executed by different
CPUs for correctness usually requires memory barriers.
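
A minimal user-space sketch of that point, not the cpufreq code itself: C11
release/acquire atomics stand in here for what smp_store_release() and
smp_load_acquire() would do in the kernel. struct gov_data, governor_data,
publish() and read_ignore_nice_load() are hypothetical names made up for the
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the governor's per-policy data. */
struct gov_data {
    int ignore_nice_load;
};

/* Hypothetical stand-in for policy->governor_data; starts out NULL. */
static _Atomic(struct gov_data *) governor_data;

/* Writer side: fully initialise the object, then publish the pointer. */
static void publish(struct gov_data *d, int value)
{
    assert(d != NULL);
    d->ignore_nice_load = value;
    /*
     * Release store: the initialisation above may not be reordered
     * after the pointer becomes visible to other threads.
     */
    atomic_store_explicit(&governor_data, d, memory_order_release);
}

/* Reader side: returns -1 while nothing has been published yet. */
static int read_ignore_nice_load(void)
{
    /*
     * Acquire load: pairs with the release store, so a non-NULL
     * pointer implies the initialised contents are visible too.
     */
    struct gov_data *d =
        atomic_load_explicit(&governor_data, memory_order_acquire);

    if (d == NULL)
        return -1;
    return d->ignore_nice_load;
}
```

With plain (non-atomic) stores and loads, neither the compiler nor the CPU
would be obliged to keep the initialisation ahead of the pointer store, which
is the kind of reordering being asked about here.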

Thanks,
Rafael



Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-28 Thread Juri Lelli
On 28/01/16 07:45, Viresh Kumar wrote:
> On 27-01-16, 23:54, Rafael J. Wysocki wrote:
> > So I've applied this, but I'm not sure it is sufficient yet.
> 
> At least, this solves the crash Juri was hitting on a multi cluster
> box.
> 
> > Have you double checked whether or not stuff cannot be reordered by
> > the CPU and/or the compiler and no additional memory barriers are needed?
> 
> I don't think CPU will reorder things before a function call. It can
> reorder lines, which CPU thinks aren't related but it can't assume the
> same in this case. We have tons of code like this.
> 
> @Juri: What do you say? 
> 

Yeah, it looks good on my boxes (even though I'll run some more tests
later today). I'm not entirely sure either about the reordering, but
reordering across a function call (of a different compilation unit)
seems quite unlikely to me as well.

Best,

- Juri


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-27 Thread Viresh Kumar
On 27-01-16, 23:54, Rafael J. Wysocki wrote:
> So I've applied this, but I'm not sure it is sufficient yet.

At least, this solves the crash Juri was hitting on a multi cluster
box.

> Have you double checked whether or not stuff cannot be reordered by
> the CPU and/or the compiler and no additional memory barriers are needed?

I don't think CPU will reorder things before a function call. It can
reorder lines, which CPU thinks aren't related but it can't assume the
same in this case. We have tons of code like this.

@Juri: What do you say? 

-- 
viresh


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-27 Thread Rafael J. Wysocki
On Monday, January 25, 2016 10:33:46 PM Viresh Kumar wrote:
> There is a little race discovered by Juri, where we are able to:
> - create and read a sysfs file before policy->governor_data is being set
>   to a non NULL value.
>   OR
> - set policy->governor_data to NULL, and reading a file before being
>   destroyed.
> 
> And so such a crash is reported:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 000c
> pgd = edfc8000
> [000c] *pgd=bfc8c835
> Internal error: Oops: 17 [#1] SMP ARM
> Modules linked in:
> CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
> Hardware name: ARM-Versatile Express
> task: ee8e8480 ti: ee93 task.ti: ee93
> PC is at show_ignore_nice_load_gov_pol+0x24/0x34
> LR is at show+0x4c/0x60
> pc : []lr : []psr: a0070013
> sp : ee931dd0  ip : ee931de0  fp : ee931ddc
> r10: ee4bc290  r9 : 1000  r8 : ef2cb000
> r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
> r3 :   r2 :   r1 : c0928df4  r0 : ef2cb000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: adfc806a  DAC: 0051
> Process cat (pid: 1730, stack limit = 0xee930210)
> Stack: (0xee931dd0 to 0xee932000)
> 1dc0: ee931dfc ee931de0 c058ae88 c058f1a4
> 1de0: edce3bc0 c07bfca4 edce3ac0 1000 ee931e24 ee931e00 c01fcb90 c058ae48
> 1e00: 0001 edce3bc0  0001 ee931e50 ee8ff480 ee931e34 ee931e28
> 1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
> 1e40:  edce3bf0 befe4a00 ee931f78   01e4 
> 1e60: c00545a8 edce3ac0 1000 1000 befe4a00 ee931f78  1000
> 1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 0002  
> 1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 1000 befe4a00 ee931f78
> 1ec0:  1000 ee931f44 ee931ed8 c017c328 c01fbdc4 1000 
> 1ee0: ee8ff480 1000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
> 1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 0020 ee8ff480
> 1f20: ee8ff480 befe4a00 1000 ee931f78   ee931f74 ee931f48
> 1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 1000 befe4a00
> 1f60:   ee931fa4 ee931f78 c017d2a8 c017d160  
> 1f80: 000a9f20 1000 befe4a00 0003 c000ffe4 ee93  ee931fa8
> 1fa0: c000fe40 c017d264 000a9f20 1000 0003 befe4a00 1000 
> Unable to handle kernel NULL pointer dereference at virtual address 000c
> 1fc0: 000a9f20 1000 befe4a00 0003   0003 0001
> pgd = edfc4000
> [000c] *pgd=bfcac835
> 1fe0:  befe49dc 000197f8 b6e35dfc 60070010 0003 3065b49d 134ac2c9
> 
> [] (show_ignore_nice_load_gov_pol) from [] 
> (show+0x4c/0x60)
> [] (show) from [] (sysfs_kf_seq_show+0x90/0xfc)
> [] (sysfs_kf_seq_show) from [] (kernfs_seq_show+0x34/0x38)
> [] (kernfs_seq_show) from [] (seq_read+0x1e4/0x4e4)
> [] (seq_read) from [] (kernfs_fop_read+0x120/0x1a0)
> [] (kernfs_fop_read) from [] (__vfs_read+0x3c/0xe0)
> [] (__vfs_read) from [] (vfs_read+0x98/0x104)
> [] (vfs_read) from [] (SyS_read+0x50/0x90)
> [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c)
> Code: e5903044 e1a1 e3081df4 e34c1092 (e593300c)
> ---[ end trace 5994b9a5111f35ee ]---
> 
> Fix that by making sure, policy->governor_data is updated at the right
> places only.
> 
> Cc:  # v4.2+
> Reported-by: Juri Lelli 
> Signed-off-by: Viresh Kumar 
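
The two windows described in the quoted changelog can be modelled with a small
single-threaded sketch. It assumes the fix amounts to "set the data before the
files appear, clear it only after they are gone"; the patch itself is not
quoted in this message, so that reading is an assumption. struct policy_model,
sysfs_visible and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the governor's tunables. */
struct gov_data {
    int ignore_nice_load;
};

struct policy_model {
    struct gov_data *governor_data; /* models policy->governor_data */
    bool sysfs_visible;             /* models "the attribute files exist" */
};

/* Models a sysfs show() callback: reachable only while files are visible. */
static int show_ignore_nice_load(const struct policy_model *p)
{
    assert(p->sysfs_visible);
    assert(p->governor_data != NULL);
    return p->governor_data->ignore_nice_load;
}

static void governor_init(struct policy_model *p, struct gov_data *d)
{
    p->governor_data = d;     /* 1. set the data first */
    p->sysfs_visible = true;  /* 2. only then expose the files */
}

static void governor_exit(struct policy_model *p)
{
    p->sysfs_visible = false; /* 1. remove the files first */
    p->governor_data = NULL;  /* 2. only then clear the data */
}

/* The property a reader relies on: visible files imply valid data. */
static bool invariant_holds(const struct policy_model *p)
{
    return !p->sysfs_visible || p->governor_data != NULL;
}
```

This only captures program order. Whether that order is also what another CPU
observes is a separate question.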

So I've applied this, but I'm not sure it is sufficient yet.

Have you double checked whether or not stuff cannot be reordered by
the CPU and/or the compiler and no additional memory barriers are needed?

Rafael



Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-27 Thread Juri Lelli
On 26/01/16 23:49, Rafael J. Wysocki wrote:
> On Tuesday, January 26, 2016 06:01:19 PM Juri Lelli wrote:
> > On 26/01/16 09:57, Juri Lelli wrote:
> > > Hi Viresh,
> > > 
> > > On 25/01/16 22:33, Viresh Kumar wrote:
> > > > There is a little race discovered by Juri, where we are able to:
> > > > - create and read a sysfs file before policy->governor_data is being set
> > > >   to a non NULL value.
> > > >   OR
> > > > - set policy->governor_data to NULL, and reading a file before being
> > > >   destroyed.
> > > > 
> 
> [cut]
> 
> > 
> > So, this goes away with your patch (that I forward ported) and a small
> > additional fix on top of that.
> 
> Which patch exactly is that?
> 

As Viresh said, this is:

 cpufreq: Access governor's sysfs attributes without 'policy->rwsem'
 
http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93

Apologies for the confusion; I was already talking with Viresh on IRC
about it.

Best,

- Juri


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-27 Thread Juri Lelli
On 27/01/16 08:40, Viresh Kumar wrote:
> On 26-01-16, 09:57, Juri Lelli wrote:
> > This patch fixes the crash I was seeing.
> > 
> > Tested-by: Juri Lelli 
> 
> Thanks.
> 
> > However, it exposes another problem (running the concurrent lockdep test
> 
> It exposes? How can this patch expose the crash below? AFAIR, you
> reported that you are getting the crash below on plain mainline on TC2,
> i.e. for drivers with the governor-per-policy flag set.
> 

Oh, simply because, without the NULL ref fix, I couldn't actually run
the test. Sorry if I was not clear.

> The reason is obvious, as the governor's sysfs directory is present in
> cpus/cpuX/cpufreq/ instead of cpus/cpufreq/, which used to be the case
> without the flag. This forces the show()/store() callbacks in
> cpufreq.c to be called, which also take policy->rwsem.
> 
> > that you merged in your tests). After the test is finished there is
> > always at least one task spinning. Do you think it might be related to
> > the race we are already discussing in the thread related to my cleanups
> > patches? This is what I see:
> 
> So this is what you reported earlier, right?
> 

Yep, same thing.

> > [   38.843648] other info that might help us debug this:
> > [   38.843648]
> > [   38.867627] Chain exists of:
> >   s_active#41 --> &policy->rwsem --> od_dbs_cdata.mutex
> > 
> > [   38.891693]  Possible unsafe locking scenario:
> > [   38.891693]
> 
> Will elaborate it a bit here..
> - CPU0 is calling governor's EXIT()
> - CPU1 is reading a governor file from sysfs
> 
> > [   38.909419]        CPU0                    CPU1
> > [   38.922978]        ----                    ----
> 
> Following needs to be added here..
> 
>                    EXIT-governor               read/write governor file
> 
>                                                lock(s_active#41);
> 
> > [   38.936535]   lock(od_dbs_cdata.mutex);
> > [   38.948146]                                lock(&policy->rwsem);
> > [   38.966168]                                lock(od_dbs_cdata.mutex);
> > [   38.985219]   lock(s_active#41);
> > [   38.994923]
> > [   38.994923]  *** DEADLOCK ***
> 
> > Now, you already pointed me at a possible fix. I'm going to test that
> > (even if I have questions about that patch :)) and see if it makes this
> > go away. 
> 
> @Rafael: Juri is talking about this patch:
> 
> http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93
> 

Right. Thanks for pointing Rafael to it.

> Juri, I thought it will fix it earlier (when I wrote it), but it never
> did on x86 (while I dropped the rwsem-drop-code around EXIT as well).
> 
> And I never came back to it and so never sent it upstream.
> 

kbuild robot didn't report anything bad yet. I'll run some more tests on
my x86 box anyway.

Best,

- Juri


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Viresh Kumar
On 26-01-16, 18:01, Juri Lelli wrote:
> So, this goes away with your patch (that I forward ported) and a small
> additional fix on top of that. I pushed all that here (so that it is
> also tested by 0-day):

I am surprised :)

>  git://linux-arm.org/linux-jl.git fixes/cpufreq/policy_exit_race 
> 
> However, I can still see what below :/.

That's bad :(

-- 
viresh


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Viresh Kumar
On 26-01-16, 09:57, Juri Lelli wrote:
> This patch fixes the crash I was seeing.
> 
> Tested-by: Juri Lelli 

Thanks.

> However, it exposes another problem (running the concurrent lockdep test

It exposes? How can this patch expose the crash below? AFAIR, you
reported that you are getting the crash below on plain mainline on TC2,
i.e. for drivers with the governor-per-policy flag set.

The reason is obvious, as the governor's sysfs directory is present in
cpus/cpuX/cpufreq/ instead of cpus/cpufreq/, which used to be the case
without the flag. This forces the show()/store() callbacks in
cpufreq.c to be called, which also take policy->rwsem.

> that you merged in your tests). After the test is finished there is
> always at least one task spinning. Do you think it might be related to
> the race we are already discussing in the thread related to my cleanups
> patches? This is what I see:

So this is what you reported earlier, right?

> [   38.843648] other info that might help us debug this:
> [   38.843648]
> [   38.867627] Chain exists of:
>   s_active#41 --> &policy->rwsem --> od_dbs_cdata.mutex
> 
> [   38.891693]  Possible unsafe locking scenario:
> [   38.891693]

Will elaborate it a bit here..
- CPU0 is calling governor's EXIT()
- CPU1 is reading a governor file from sysfs

> [   38.909419]        CPU0                    CPU1
> [   38.922978]        ----                    ----

Following needs to be added here..

                   EXIT-governor               read/write governor file

                                               lock(s_active#41);

> [   38.936535]   lock(od_dbs_cdata.mutex);
> [   38.948146]                                lock(&policy->rwsem);
> [   38.966168]                                lock(od_dbs_cdata.mutex);
> [   38.985219]   lock(s_active#41);
> [   38.994923]
> [   38.994923]  *** DEADLOCK ***
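
The scenario above is a lock-order cycle. As a toy illustration only (this is
not how lockdep is implemented), the sketch below records "acquired B while
holding A" edges and reports when a new edge closes a cycle. The enum names
and record_dependency() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The three locks named in the report, as graph nodes. */
enum lock_id { S_ACTIVE, POLICY_RWSEM, DBS_MUTEX, NR_LOCKS };

/* dep[a][b] is true when some path acquired b while holding a. */
static bool dep[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(int from, int to, bool *visited)
{
    if (from == to)
        return true;
    visited[from] = true;
    for (int next = 0; next < NR_LOCKS; next++) {
        if (dep[from][next] && !visited[next] &&
            reachable(next, to, visited))
            return true;
    }
    return false;
}

/*
 * Record "acquired `taken` while holding `held`".
 * Returns true if `held` was already reachable from `taken`,
 * i.e. the new edge closes a cycle (a possible deadlock).
 */
static bool record_dependency(int held, int taken)
{
    bool visited[NR_LOCKS] = { false };
    bool cycle = reachable(taken, held, visited);

    dep[held][taken] = true;
    return cycle;
}
```

Feeding it the chain from the report (s_active, then policy->rwsem, then
od_dbs_cdata.mutex) and then the EXIT path (od_dbs_cdata.mutex held while
taking s_active) makes the last call return true.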

> Now, you already pointed me at a possible fix. I'm going to test that
> (even if I have questions about that patch :)) and see if it makes this
> go away. 

@Rafael: Juri is talking about this patch:

http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93

Juri, I thought it will fix it earlier (when I wrote it), but it never
did on x86 (while I dropped the rwsem-drop-code around EXIT as well).

And I never came back to it and so never sent it upstream.

-- 
viresh


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Rafael J. Wysocki
On Tuesday, January 26, 2016 06:01:19 PM Juri Lelli wrote:
> On 26/01/16 09:57, Juri Lelli wrote:
> > Hi Viresh,
> > 
> > On 25/01/16 22:33, Viresh Kumar wrote:
> > > There is a little race discovered by Juri, where we are able to:
> > > - create and read a sysfs file before policy->governor_data is being set
> > >   to a non NULL value.
> > >   OR
> > > - set policy->governor_data to NULL, and reading a file before being
> > >   destroyed.
> > > 

[cut]

> 
> So, this goes away with your patch (that I forward ported) and a small
> additional fix on top of that.

Which patch exactly is that?

Thanks,
Rafael



Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Juri Lelli
On 26/01/16 09:57, Juri Lelli wrote:
> Hi Viresh,
> 
> On 25/01/16 22:33, Viresh Kumar wrote:
> > There is a little race discovered by Juri, where we are able to:
> > - create and read a sysfs file before policy->governor_data is being set
> >   to a non NULL value.
> >   OR
> > - set policy->governor_data to NULL, and reading a file before being
> >   destroyed.
> > 
> > And so such a crash is reported:
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address 000c
> > pgd = edfc8000
> > [000c] *pgd=bfc8c835
> > Internal error: Oops: 17 [#1] SMP ARM
> > Modules linked in:
> > CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
> > Hardware name: ARM-Versatile Express
> > task: ee8e8480 ti: ee93 task.ti: ee93
> > PC is at show_ignore_nice_load_gov_pol+0x24/0x34
> > LR is at show+0x4c/0x60
> > pc : []lr : []psr: a0070013
> > sp : ee931dd0  ip : ee931de0  fp : ee931ddc
> > r10: ee4bc290  r9 : 1000  r8 : ef2cb000
> > r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
> > r3 :   r2 :   r1 : c0928df4  r0 : ef2cb000
> > Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > Control: 10c5387d  Table: adfc806a  DAC: 0051
> > Process cat (pid: 1730, stack limit = 0xee930210)
> > Stack: (0xee931dd0 to 0xee932000)
> > 1dc0: ee931dfc ee931de0 c058ae88 
> > c058f1a4
> > 1de0: edce3bc0 c07bfca4 edce3ac0 1000 ee931e24 ee931e00 c01fcb90 
> > c058ae48
> > 1e00: 0001 edce3bc0  0001 ee931e50 ee8ff480 ee931e34 
> > ee931e28
> > 1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c 
> > ee931e48
> > 1e40:  edce3bf0 befe4a00 ee931f78   01e4 
> > 
> > 1e60: c00545a8 edce3ac0 1000 1000 befe4a00 ee931f78  
> > 1000
> > 1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 0002  
> > 
> > 1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 1000 befe4a00 
> > ee931f78
> > 1ec0:  1000 ee931f44 ee931ed8 c017c328 c01fbdc4 1000 
> > 
> > 1ee0: ee8ff480 1000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac 
> > ee931f08
> > 1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 0020 
> > ee8ff480
> > 1f20: ee8ff480 befe4a00 1000 ee931f78   ee931f74 
> > ee931f48
> > 1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 1000 
> > befe4a00
> > 1f60:   ee931fa4 ee931f78 c017d2a8 c017d160  
> > 
> > 1f80: 000a9f20 1000 befe4a00 0003 c000ffe4 ee93  
> > ee931fa8
> > 1fa0: c000fe40 c017d264 000a9f20 1000 0003 befe4a00 1000 
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address 000c
> > 1fc0: 000a9f20 1000 befe4a00 0003   0003 
> > 0001
> > pgd = edfc4000
> > [000c] *pgd=bfcac835
> > 1fe0:  befe49dc 000197f8 b6e35dfc 60070010 0003 3065b49d 
> > 134ac2c9
> > 
> > [] (show_ignore_nice_load_gov_pol) from [] 
> > (show+0x4c/0x60)
> > [] (show) from [] (sysfs_kf_seq_show+0x90/0xfc)
> > [] (sysfs_kf_seq_show) from [] 
> > (kernfs_seq_show+0x34/0x38)
> > [] (kernfs_seq_show) from [] (seq_read+0x1e4/0x4e4)
> > [] (seq_read) from [] (kernfs_fop_read+0x120/0x1a0)
> > [] (kernfs_fop_read) from [] (__vfs_read+0x3c/0xe0)
> > [] (__vfs_read) from [] (vfs_read+0x98/0x104)
> > [] (vfs_read) from [] (SyS_read+0x50/0x90)
> > [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c)
> > Code: e5903044 e1a1 e3081df4 e34c1092 (e593300c)
> > ---[ end trace 5994b9a5111f35ee ]---
> > 
> > Fix that by making sure, policy->governor_data is updated at the right
> > places only.
> > 
> 
> This patch fixes the crash I was seeing.
> 
> Tested-by: Juri Lelli 
> 
> However, it exposes another problem (running the concurrent lockdep test
> that you merged in your tests). After the test is finished there is
> always at least one task spinning. Do you think it might be related to
> the race we are already discussing in the thread related to my cleanups
> patches? This is what I see:
> 
> [   37.963599] ==
> [   37.982113] [ INFO: possible circular locking dependency detected ]
> [   38.000890] 4.5.0-rc1+ #468 Not tainted
> [   38.012368] ---
> [   38.031137] runme.sh/1710 is trying to acquire lock:
> [   38.045999]  (s_active#41){.+}, at: [] 
> kernfs_remove_by_name_ns+0x4c/0x94
> [   38.070063]
> [   38.070063] but task is already holding lock:
> [   38.087530]  (od_dbs_cdata.mutex){+.+.+.}, at: [] 
> cpufreq_governor_dbs+0x34/0x5d0
> [   38.112615]
> [   38.112615] which lock already depends on the new lock.
> [   38.112615]
> [   38.137114]
> [   38.137114] the existing dependency chain (in reverse order) is:
> [   38.159528]
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
> [   38.173664][] mutex_lock_nested+0x7c/0x420
> [   

Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Juri Lelli
Hi Viresh,

On 25/01/16 22:33, Viresh Kumar wrote:
> There is a little race discovered by Juri, where we are able to:
> - create and read a sysfs file before policy->governor_data is being set
>   to a non NULL value.
>   OR
> - set policy->governor_data to NULL, and reading a file before being
>   destroyed.
> 
> And so such a crash is reported:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 000c
> pgd = edfc8000
> [000c] *pgd=bfc8c835
> Internal error: Oops: 17 [#1] SMP ARM
> Modules linked in:
> CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
> Hardware name: ARM-Versatile Express
> task: ee8e8480 ti: ee93 task.ti: ee93
> PC is at show_ignore_nice_load_gov_pol+0x24/0x34
> LR is at show+0x4c/0x60
> pc : []lr : []psr: a0070013
> sp : ee931dd0  ip : ee931de0  fp : ee931ddc
> r10: ee4bc290  r9 : 1000  r8 : ef2cb000
> r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
> r3 :   r2 :   r1 : c0928df4  r0 : ef2cb000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: adfc806a  DAC: 0051
> Process cat (pid: 1730, stack limit = 0xee930210)
> Stack: (0xee931dd0 to 0xee932000)
> 1dc0: ee931dfc ee931de0 c058ae88 c058f1a4
> 1de0: edce3bc0 c07bfca4 edce3ac0 1000 ee931e24 ee931e00 c01fcb90 c058ae48
> 1e00: 0001 edce3bc0  0001 ee931e50 ee8ff480 ee931e34 ee931e28
> 1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
> 1e40:  edce3bf0 befe4a00 ee931f78   01e4 
> 1e60: c00545a8 edce3ac0 1000 1000 befe4a00 ee931f78  1000
> 1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 0002  
> 1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 1000 befe4a00 ee931f78
> 1ec0:  1000 ee931f44 ee931ed8 c017c328 c01fbdc4 1000 
> 1ee0: ee8ff480 1000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
> 1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 0020 ee8ff480
> 1f20: ee8ff480 befe4a00 1000 ee931f78   ee931f74 ee931f48
> 1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 1000 befe4a00
> 1f60:   ee931fa4 ee931f78 c017d2a8 c017d160  
> 1f80: 000a9f20 1000 befe4a00 0003 c000ffe4 ee93  ee931fa8
> 1fa0: c000fe40 c017d264 000a9f20 1000 0003 befe4a00 1000 
> Unable to handle kernel NULL pointer dereference at virtual address 000c
> 1fc0: 000a9f20 1000 befe4a00 0003   0003 0001
> pgd = edfc4000
> [000c] *pgd=bfcac835
> 1fe0:  befe49dc 000197f8 b6e35dfc 60070010 0003 3065b49d 134ac2c9
> 
> [] (show_ignore_nice_load_gov_pol) from [] 
> (show+0x4c/0x60)
> [] (show) from [] (sysfs_kf_seq_show+0x90/0xfc)
> [] (sysfs_kf_seq_show) from [] (kernfs_seq_show+0x34/0x38)
> [] (kernfs_seq_show) from [] (seq_read+0x1e4/0x4e4)
> [] (seq_read) from [] (kernfs_fop_read+0x120/0x1a0)
> [] (kernfs_fop_read) from [] (__vfs_read+0x3c/0xe0)
> [] (__vfs_read) from [] (vfs_read+0x98/0x104)
> [] (vfs_read) from [] (SyS_read+0x50/0x90)
> [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c)
> Code: e5903044 e1a1 e3081df4 e34c1092 (e593300c)
> ---[ end trace 5994b9a5111f35ee ]---
> 
> Fix that by making sure, policy->governor_data is updated at the right
> places only.
> 

This patch fixes the crash I was seeing.

Tested-by: Juri Lelli 

However, it exposes another problem (running the concurrent lockdep test
that you merged in your tests). After the test is finished there is
always at least one task spinning. Do you think it might be related to
the race we are already discussing in the thread related to my cleanups
patches? This is what I see:

[   37.963599] ==
[   37.982113] [ INFO: possible circular locking dependency detected ]
[   38.000890] 4.5.0-rc1+ #468 Not tainted
[   38.012368] ---
[   38.031137] runme.sh/1710 is trying to acquire lock:
[   38.045999]  (s_active#41){.+}, at: [] kernfs_remove_by_name_ns+0x4c/0x94
[   38.070063]
[   38.070063] but task is already holding lock:
[   38.087530]  (od_dbs_cdata.mutex){+.+.+.}, at: [] cpufreq_governor_dbs+0x34/0x5d0
[   38.112615]
[   38.112615] which lock already depends on the new lock.
[   38.112615]
[   38.137114]
[   38.137114] the existing dependency chain (in reverse order) is:
[   38.159528]
-> #2 (od_dbs_cdata.mutex){+.+.+.}:
[   38.173664][] mutex_lock_nested+0x7c/0x420
[   38.190637][] cpufreq_governor_dbs+0x34/0x5d0
[   38.208380][] od_cpufreq_governor_dbs+0x20/0x28
[   38.226641][] __cpufreq_governor+0x98/0x1bc
[   38.243861][] cpufreq_set_policy+0x150/0x204
[   38.261341][] store_scaling_governor+0x70/0x8c
[   38.279343][] store+0x88/0xa4
[   38.292917]

Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Juri Lelli
On 26/01/16 09:57, Juri Lelli wrote:
> Hi Viresh,
> 
> On 25/01/16 22:33, Viresh Kumar wrote:
> > There is a little race discovered by Juri, where we are able to:
> > - create and read a sysfs file before policy->governor_data is being set
> >   to a non NULL value.
> >   OR
> > - set policy->governor_data to NULL, and reading a file before being
> >   destroyed.
> > 
> > And so such a crash is reported:
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address 000c
> > pgd = edfc8000
> > [000c] *pgd=bfc8c835
> > Internal error: Oops: 17 [#1] SMP ARM
> > Modules linked in:
> > CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
> > Hardware name: ARM-Versatile Express
> > task: ee8e8480 ti: ee93 task.ti: ee93
> > PC is at show_ignore_nice_load_gov_pol+0x24/0x34
> > LR is at show+0x4c/0x60
> > pc : []lr : []psr: a0070013
> > sp : ee931dd0  ip : ee931de0  fp : ee931ddc
> > r10: ee4bc290  r9 : 1000  r8 : ef2cb000
> > r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
> > r3 :   r2 :   r1 : c0928df4  r0 : ef2cb000
> > Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > Control: 10c5387d  Table: adfc806a  DAC: 0051
> > Process cat (pid: 1730, stack limit = 0xee930210)
> > Stack: (0xee931dd0 to 0xee932000)
> > 1dc0: ee931dfc ee931de0 c058ae88 c058f1a4
> > 1de0: edce3bc0 c07bfca4 edce3ac0 1000 ee931e24 ee931e00 c01fcb90 c058ae48
> > 1e00: 0001 edce3bc0  0001 ee931e50 ee8ff480 ee931e34 ee931e28
> > 1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
> > 1e40:  edce3bf0 befe4a00 ee931f78   01e4
> > 1e60: c00545a8 edce3ac0 1000 1000 befe4a00 ee931f78  1000
> > 1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 0002
> > 1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 1000 befe4a00 ee931f78
> > 1ec0:  1000 ee931f44 ee931ed8 c017c328 c01fbdc4 1000
> > 1ee0: ee8ff480 1000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
> > 1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 0020 ee8ff480
> > 1f20: ee8ff480 befe4a00 1000 ee931f78   ee931f74 ee931f48
> > 1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 1000 befe4a00
> > 1f60:   ee931fa4 ee931f78 c017d2a8 c017d160
> > 1f80: 000a9f20 1000 befe4a00 0003 c000ffe4 ee93  ee931fa8
> > 1fa0: c000fe40 c017d264 000a9f20 1000 0003 befe4a00 1000
> > Unable to handle kernel NULL pointer dereference at virtual address 000c
> > 1fc0: 000a9f20 1000 befe4a00 0003   0003 0001
> > pgd = edfc4000
> > [000c] *pgd=bfcac835
> > 1fe0:  befe49dc 000197f8 b6e35dfc 60070010 0003 3065b49d 134ac2c9
> > 
> > [] (show_ignore_nice_load_gov_pol) from [] (show+0x4c/0x60)
> > [] (show) from [] (sysfs_kf_seq_show+0x90/0xfc)
> > [] (sysfs_kf_seq_show) from [] (kernfs_seq_show+0x34/0x38)
> > [] (kernfs_seq_show) from [] (seq_read+0x1e4/0x4e4)
> > [] (seq_read) from [] (kernfs_fop_read+0x120/0x1a0)
> > [] (kernfs_fop_read) from [] (__vfs_read+0x3c/0xe0)
> > [] (__vfs_read) from [] (vfs_read+0x98/0x104)
> > [] (vfs_read) from [] (SyS_read+0x50/0x90)
> > [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c)
> > Code: e5903044 e1a1 e3081df4 e34c1092 (e593300c)
> > ---[ end trace 5994b9a5111f35ee ]---
> > 
> > Fix that by making sure, policy->governor_data is updated at the right
> > places only.
> > 
> 
> This patch fixes the crash I was seeing.
> 
> Tested-by: Juri Lelli 
> 
> However, it exposes another problem (running the concurrent lockdep test
> that you merged in your tests). After the test is finished there is
> always at least one task spinning. Do you think it might be related to
> the race we are already discussing in the thread related to my cleanups
> patches? This is what I see:
> 
> [   37.963599] ==
> [   37.982113] [ INFO: possible circular locking dependency detected ]
> [   38.000890] 4.5.0-rc1+ #468 Not tainted
> [   38.012368] ---
> [   38.031137] runme.sh/1710 is trying to acquire lock:
> [   38.045999]  (s_active#41){.+}, at: [] kernfs_remove_by_name_ns+0x4c/0x94
> [   38.070063]
> [   38.070063] but task is already holding lock:
> [   38.087530]  (od_dbs_cdata.mutex){+.+.+.}, at: [] cpufreq_governor_dbs+0x34/0x5d0
> [   38.112615]
> [   38.112615] which lock already depends on the new lock.
> [   38.112615]
> [   38.137114]
> [   38.137114] the existing dependency chain (in reverse order) is:
> [   38.159528]
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
> [   38.173664][] 

Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Rafael J. Wysocki
On Tuesday, January 26, 2016 06:01:19 PM Juri Lelli wrote:
> On 26/01/16 09:57, Juri Lelli wrote:
> > Hi Viresh,
> > 
> > On 25/01/16 22:33, Viresh Kumar wrote:
> > > There is a little race discovered by Juri, where we are able to:
> > > - create and read a sysfs file before policy->governor_data is being set
> > >   to a non NULL value.
> > >   OR
> > > - set policy->governor_data to NULL, and reading a file before being
> > >   destroyed.
> > > 

[cut]

> 
> So, this goes away with your patch (that I forward ported) and a small
> additional fix on top of that.

Which patch exactly is that?

Thanks,
Rafael



Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Viresh Kumar
On 26-01-16, 09:57, Juri Lelli wrote:
> This patch fixes the crash I was seeing.
> 
> Tested-by: Juri Lelli 

Thanks.

> However, it exposes another problem (running the concurrent lockdep test

It exposes? How can this patch expose the crash below? AFAIR, you
reported that you are getting this crash on plain mainline on TC2,
i.e. for drivers with the governor-per-policy flag set.

The reason is obvious: with that flag set, the governor's sysfs directory
is present in cpus/cpuX/cpufreq/ instead of cpus/cpufreq/, which is where
it lives without the flag. This forces the show()/store() callbacks in
cpufreq.c to be called, and those also take policy->rwsem.
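
To make the directory choice described above concrete, here is a minimal
user-space sketch. The helper governor_sysfs_dir() and its parameters are
made up for illustration and are not kernel API; the two path strings are
the abbreviated ones used in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): writes the directory in
 * which the governor tunables would appear, using the abbreviated paths
 * from the message above. Returns the length reported by snprintf().
 */
int governor_sysfs_dir(char *buf, size_t len, bool per_policy, unsigned int cpu)
{
    assert(buf != NULL && len > 0);

    if (per_policy)
        /* per-policy tunables: reads go through show()/store() in cpufreq.c */
        return snprintf(buf, len, "cpus/cpu%u/cpufreq/", cpu);

    /* one global set of tunables shared by all policies */
    return snprintf(buf, len, "cpus/cpufreq/");
}
```

With per_policy set, the files sit under the policy's own directory, which
is the case in which reads also take policy->rwsem.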

> that you merged in your tests). After the test is finished there is
> always at least one task spinning. Do you think it might be related to
> the race we are already discussing in the thread related to my cleanups
> patches? This is what I see:

So this is what you reported earlier, right?

> [   38.843648] other info that might help us debug this:
> [   38.843648]
> [   38.867627] Chain exists of:
>   s_active#41 --> &policy->rwsem --> od_dbs_cdata.mutex
> 
> [   38.891693]  Possible unsafe locking scenario:
> [   38.891693]

Will elaborate it a bit here..
- CPU0 is calling governor's EXIT()
- CPU1 is reading a governor file from sysfs

> [   38.909419]        CPU0                    CPU1
> [   38.922978]        ----                    ----

Following needs to be added here..

                   EXIT-governor           read/write governor file

                                           lock(s_active#41);

> [   38.936535]   lock(od_dbs_cdata.mutex);
> [   38.948146]                                lock(&policy->rwsem);
> [   38.966168]                                lock(od_dbs_cdata.mutex);
> [   38.985219]   lock(s_active#41);
> [   38.994923]
> [   38.994923]  *** DEADLOCK ***
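
The cycle in the report above can be reproduced with a toy lock-order
graph. This is only a sketch of the idea behind the warning, not lockdep's
implementation; the enum names, struct lock_graph and add_dependency() are
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* The three locks named in the report; identifiers are illustrative. */
enum lock_id { S_ACTIVE, POLICY_RWSEM, OD_DBS_MUTEX, NR_LOCKS };

/* held_before[a][b]: some code path took b while already holding a. */
struct lock_graph { bool held_before[NR_LOCKS][NR_LOCKS]; };

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reaches(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCKS; next++)
        if (g->held_before[from][next] && !seen[next] &&
            reaches(g, next, to, seen))
            return true;
    return false;
}

/*
 * Record "inner was taken while holding outer". Returns true if the new
 * edge closes a cycle, i.e. inner already led back to outer.
 */
bool add_dependency(struct lock_graph *g, int outer, int inner)
{
    bool seen[NR_LOCKS] = { false };
    bool cycle;

    assert(outer != inner);
    cycle = reaches(g, inner, outer, seen);
    g->held_before[outer][inner] = true;
    return cycle;
}

/* Replays the three orderings from the report, in the order they occur. */
bool governor_exit_closes_cycle(void)
{
    struct lock_graph g = { 0 };

    /* sysfs access holds s_active, then show()/store() take policy->rwsem */
    add_dependency(&g, S_ACTIVE, POLICY_RWSEM);
    /* the governor callback takes od_dbs_cdata.mutex under policy->rwsem */
    add_dependency(&g, POLICY_RWSEM, OD_DBS_MUTEX);
    /* EXIT removes the sysfs files (needs s_active) while holding the mutex */
    return add_dependency(&g, OD_DBS_MUTEX, S_ACTIVE);
}
```

The first two edges are harmless on their own; only the third one, taken
on the EXIT path, closes the loop.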

> Now, you already pointed me at a possible fix. I'm going to test that
> (even if I have questions about that patch :)) and see if it makes this
> go away. 

@Rafael: Juri is talking about this patch:

http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93

Juri, I thought it would fix it earlier (when I wrote it), but it never
did on x86 (while I dropped the rwsem-drop-code around EXIT as well).

And I never came back to it and so never sent it upstream.

-- 
viresh


Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-26 Thread Viresh Kumar
On 26-01-16, 18:01, Juri Lelli wrote:
> So, this goes away with your patch (that I forward ported) and a small
> additional fix on top of that. I pushed all that here (so that it is
> also tested by 0-day):

I am surprised :)

>  git://linux-arm.org/linux-jl.git fixes/cpufreq/policy_exit_race 
> 
> However, I can still see what's below :/.

That's bad :(

-- 
viresh


[PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

2016-01-25 Thread Viresh Kumar
There is a small race discovered by Juri, where we are able to:
- create and read a sysfs file before policy->governor_data has been set
  to a non-NULL value,
  OR
- set policy->governor_data to NULL and then read a sysfs file before
  that file has been destroyed.

And so such a crash is reported:

Unable to handle kernel NULL pointer dereference at virtual address 000c
pgd = edfc8000
[000c] *pgd=bfc8c835
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
Hardware name: ARM-Versatile Express
task: ee8e8480 ti: ee93 task.ti: ee93
PC is at show_ignore_nice_load_gov_pol+0x24/0x34
LR is at show+0x4c/0x60
pc : []lr : []psr: a0070013
sp : ee931dd0  ip : ee931de0  fp : ee931ddc
r10: ee4bc290  r9 : 1000  r8 : ef2cb000
r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
r3 :   r2 :   r1 : c0928df4  r0 : ef2cb000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: adfc806a  DAC: 0051
Process cat (pid: 1730, stack limit = 0xee930210)
Stack: (0xee931dd0 to 0xee932000)
1dc0: ee931dfc ee931de0 c058ae88 c058f1a4
1de0: edce3bc0 c07bfca4 edce3ac0 1000 ee931e24 ee931e00 c01fcb90 c058ae48
1e00: 0001 edce3bc0  0001 ee931e50 ee8ff480 ee931e34 ee931e28
1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
1e40:  edce3bf0 befe4a00 ee931f78   01e4 
1e60: c00545a8 edce3ac0 1000 1000 befe4a00 ee931f78  1000
1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 0002  
1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 1000 befe4a00 ee931f78
1ec0:  1000 ee931f44 ee931ed8 c017c328 c01fbdc4 1000 
1ee0: ee8ff480 1000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 0020 ee8ff480
1f20: ee8ff480 befe4a00 1000 ee931f78   ee931f74 ee931f48
1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 1000 befe4a00
1f60:   ee931fa4 ee931f78 c017d2a8 c017d160  
1f80: 000a9f20 1000 befe4a00 0003 c000ffe4 ee93  ee931fa8
1fa0: c000fe40 c017d264 000a9f20 1000 0003 befe4a00 1000 
Unable to handle kernel NULL pointer dereference at virtual address 000c
1fc0: 000a9f20 1000 befe4a00 0003   0003 0001
pgd = edfc4000
[000c] *pgd=bfcac835
1fe0:  befe49dc 000197f8 b6e35dfc 60070010 0003 3065b49d 134ac2c9

[] (show_ignore_nice_load_gov_pol) from [] (show+0x4c/0x60)
[] (show) from [] (sysfs_kf_seq_show+0x90/0xfc)
[] (sysfs_kf_seq_show) from [] (kernfs_seq_show+0x34/0x38)
[] (kernfs_seq_show) from [] (seq_read+0x1e4/0x4e4)
[] (seq_read) from [] (kernfs_fop_read+0x120/0x1a0)
[] (kernfs_fop_read) from [] (__vfs_read+0x3c/0xe0)
[] (__vfs_read) from [] (vfs_read+0x98/0x104)
[] (vfs_read) from [] (SyS_read+0x50/0x90)
[] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c)
Code: e5903044 e1a1 e3081df4 e34c1092 (e593300c)
---[ end trace 5994b9a5111f35ee ]---

Fix that by making sure policy->governor_data is set before the sysfs
files are created and cleared only after they have been removed.
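
The ordering can be pictured with a small single-threaded model. It is a
sketch under assumptions: struct policy, struct dbs_data and every function
below are simplified stand-ins rather than the kernel definitions, a reader
is allowed to run only between two steps, and memory-ordering effects are
ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel objects named in the patch. */
struct dbs_data { int ignore_nice_load; };

struct policy {
    struct dbs_data *governor_data; /* dereferenced by the show() handler */
    bool file_visible;              /* models "the sysfs group exists" */
};

/* Models one sysfs read; true where the real handler would hit NULL. */
static bool read_hits_null(const struct policy *p)
{
    return p->file_visible && p->governor_data == NULL;
}

/* Init in two steps; 'fixed' selects the order introduced by the patch. */
static void init_step(struct policy *p, struct dbs_data *d, bool fixed, int step)
{
    bool publish_now = fixed ? (step == 0) : (step == 1);

    assert(step == 0 || step == 1);
    if (publish_now)
        p->governor_data = d;   /* set the pointer */
    else
        p->file_visible = true; /* create the sysfs file */
}

/* Exit in two steps; the patched order removes the file first. */
static void exit_step(struct policy *p, bool fixed, int step)
{
    bool hide_now = fixed ? (step == 0) : (step == 1);

    assert(step == 0 || step == 1);
    if (hide_now)
        p->file_visible = false; /* remove the sysfs file */
    else
        p->governor_data = NULL; /* clear the pointer */
}

/* Let a reader run after every init step; report any NULL dereference. */
bool init_can_oops(bool fixed)
{
    struct dbs_data d = { .ignore_nice_load = 0 };
    struct policy p = { .governor_data = NULL, .file_visible = false };
    bool oops = false;

    for (int step = 0; step < 2; step++) {
        init_step(&p, &d, fixed, step);
        oops |= read_hits_null(&p);
    }
    return oops;
}

/* Same check for the exit path, starting from a fully set-up policy. */
bool exit_can_oops(bool fixed)
{
    struct dbs_data d = { .ignore_nice_load = 0 };
    struct policy p = { .governor_data = &d, .file_visible = true };
    bool oops = false;

    for (int step = 0; step < 2; step++) {
        exit_step(&p, fixed, step);
        oops |= read_hits_null(&p);
    }
    return oops;
}
```

In this model the old order leaves a window on both paths and the patched
order leaves none; whether extra barriers are needed on real hardware is
outside what the model covers.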

Cc:  # v4.2+
Reported-by: Juri Lelli 
Signed-off-by: Viresh Kumar 
---
 drivers/cpufreq/cpufreq_governor.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index bab3a514ec12..e0d111024d48 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -387,16 +387,18 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
if (!have_governor_per_policy())
cdata->gdbs_data = dbs_data;
 
+   policy->governor_data = dbs_data;
+
ret = sysfs_create_group(get_governor_parent_kobj(policy),
 get_sysfs_attr(dbs_data));
if (ret)
goto reset_gdbs_data;
 
-   policy->governor_data = dbs_data;
-
return 0;
 
 reset_gdbs_data:
+   policy->governor_data = NULL;
+
if (!have_governor_per_policy())
cdata->gdbs_data = NULL;
cdata->exit(dbs_data, !policy->governor->initialized);
@@ -417,16 +419,19 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy,
if (!cdbs->shared || cdbs->shared->policy)
return -EBUSY;
 
-   policy->governor_data = NULL;
if (!--dbs_data->usage_count) {
sysfs_remove_group(get_governor_parent_kobj(policy),
   get_sysfs_attr(dbs_data));
 
+   policy->governor_data = NULL;
+
if (!have_governor_per_policy())
cdata->gdbs_data = NULL;
 
cdata->exit(dbs_data, policy->governor->initialized == 1);