Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/16/2014 11:14 AM, Viresh Kumar wrote:
> On 15 July 2014 12:28, Srivatsa S. Bhat wrote:
>> Wait, allowing an offline CPU to be the policy->cpu (i.e., the CPU which is
>> considered as the master of the policy/group) is just absurd.
>
> Yeah, that was as Absurd as I am :)

I have had my own share of silly ideas over the years; so don't worry, we
are all in the same boat ;-)

>> The goal of this patchset should be to just de-couple the sysfs
>> files/ownership from the policy->cpu to an extent where it doesn't matter
>> who owns those files, and probably make it easier to do CPU hotplug
>> without having to destroy and recreate the files on every hotplug
>> operation.
>
> I went to that Absurd idea because we thought we could skip playing with
> the sysfs nodes on suspend/hotplug.
>
> And if policy->cpu keeps changing with hotplug, we *may* have to keep the
> sysfs stuff moving as well. One way to avoid that is by using something
> like policy->sysfs_cpu, but I wasn't sure if that's the right path to follow.

Hmm, I understand. Even I don't have any suggestions as of now, since I
haven't spent enough time thinking of alternatives yet.

> Let's see what Saravana's new patchset has for us :)

Yep :-)

Regards,
Srivatsa S. Bhat
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/15/2014 11:05 PM, skan...@codeaurora.org wrote:
> Srivatsa S. Bhat wrote:
>> On 07/15/2014 11:06 AM, Saravana Kannan wrote:
>>> On 07/14/2014 09:35 PM, Viresh Kumar wrote:
>>>> On 15 July 2014 00:38, Saravana Kannan wrote:
>>>>> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because
>>>>> the mutex would be uninitialized if it's stopped after boot, or it
>>>>> would never have been initialized (depending on how you fix
>>>>> policy->cpu at boot).
>>>>>
>>>>> Look at this snippet on the actual tree and it should be pretty
>>>>> evident.
>>>>
>>>> Yeah, I missed it. So the problem is we initialize timer_mutex's for
>>>> policy->cpus. So we need to do that just for policy->cpu, and also we
>>>> don't need a per-cpu timer_mutex anymore.
>>>
>>> Btw, I tried to take a stab at removing any assumption in cpufreq code
>>> about policy->cpu being ONLINE.
>>
>> Wait, allowing an offline CPU to be the policy->cpu (i.e., the CPU which
>> is considered as the master of the policy/group) is just absurd. If there
>> is no leader, there is no army. We should NOT sacrifice sane semantics
>> for the sake of simplifying the code.
>>
>>> There are 160 instances of those, of which 23 are in cpufreq.c.
>>
>> And that explains why. It is just *natural* to assume that the CPUs
>> governed by a policy are online. Especially so for the CPU which is
>> supposed to be the policy leader. Let us please not change that - it will
>> become counter-intuitive if we do so. [ The other reason is that physical
>> hotplug is also possible on some systems... in that case your code might
>> make a CPU which is not even present (but possible) the policy->cpu, and
>> great 'fun' will ensue after that ;-( ]
>>
>> The goal of this patchset should be to just de-couple the sysfs
>> files/ownership from the policy->cpu to an extent where it doesn't matter
>> who owns those files, and probably make it easier to do CPU hotplug
>> without having to destroy and recreate the files on every hotplug
>> operation.
>>
>> This is exactly why the _implementation_ matters in this particular case -
>> if we can't achieve the simplification by keeping sane semantics, then we
>> shouldn't do the simplification!
>>
>> That said, I think we should keep trying - we haven't exhausted all ideas
>> yet :-)
>
> I don't think we disagree. To summarize this topic: I tried to keep
> policy->cpu an actual online CPU so as to not break existing semantics in
> this patch. Viresh asked "why not fix it at boot?". My response was to
> keep it an online CPU and give it a shot in a separate patch if we really
> want that. It's too risky to do that in this patch, and it's also not a
> mandatory change for this patch.
>
> I think we can work out the details on the need to fix policy->cpu at
> boot, and whether there's even a need for policy->cpu (when we already
> have policy->cpus), in a separate thread after the dust settles on this
> one?

Sure, that sounds good!

Regards,
Srivatsa S. Bhat
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 15 July 2014 12:28, Srivatsa S. Bhat wrote:
> Wait, allowing an offline CPU to be the policy->cpu (i.e., the CPU which is
> considered as the master of the policy/group) is just absurd.

Yeah, that was as Absurd as I am :)

> The goal of this patchset should be to just de-couple the sysfs
> files/ownership from the policy->cpu to an extent where it doesn't matter
> who owns those files, and probably make it easier to do CPU hotplug
> without having to destroy and recreate the files on every hotplug
> operation.

I went to that Absurd idea because we thought we could skip playing with
the sysfs nodes on suspend/hotplug.

And if policy->cpu keeps changing with hotplug, we *may* have to keep the
sysfs stuff moving as well. One way to avoid that is by using something
like policy->sysfs_cpu, but I wasn't sure if that's the right path to
follow.

Let's see what Saravana's new patchset has for us :)
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
Srivatsa S. Bhat wrote:
> On 07/15/2014 11:06 AM, Saravana Kannan wrote:
>> On 07/14/2014 09:35 PM, Viresh Kumar wrote:
>>> On 15 July 2014 00:38, Saravana Kannan wrote:
>>>> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because
>>>> the mutex would be uninitialized if it's stopped after boot, or it
>>>> would never have been initialized (depending on how you fix
>>>> policy->cpu at boot).
>>>>
>>>> Look at this snippet on the actual tree and it should be pretty
>>>> evident.
>>>
>>> Yeah, I missed it. So the problem is we initialize timer_mutex's for
>>> policy->cpus. So we need to do that just for policy->cpu, and also we
>>> don't need a per-cpu timer_mutex anymore.
>>
>> Btw, I tried to take a stab at removing any assumption in cpufreq code
>> about policy->cpu being ONLINE.
>
> Wait, allowing an offline CPU to be the policy->cpu (i.e., the CPU which
> is considered as the master of the policy/group) is just absurd. If there
> is no leader, there is no army. We should NOT sacrifice sane semantics for
> the sake of simplifying the code.
>
>> There are 160 instances of those, of which 23 are in cpufreq.c.
>
> And that explains why. It is just *natural* to assume that the CPUs
> governed by a policy are online. Especially so for the CPU which is
> supposed to be the policy leader. Let us please not change that - it will
> become counter-intuitive if we do so. [ The other reason is that physical
> hotplug is also possible on some systems... in that case your code might
> make a CPU which is not even present (but possible) the policy->cpu, and
> great 'fun' will ensue after that ;-( ]
>
> The goal of this patchset should be to just de-couple the sysfs
> files/ownership from the policy->cpu to an extent where it doesn't matter
> who owns those files, and probably make it easier to do CPU hotplug
> without having to destroy and recreate the files on every hotplug
> operation.
>
> This is exactly why the _implementation_ matters in this particular case -
> if we can't achieve the simplification by keeping sane semantics, then we
> shouldn't do the simplification!
>
> That said, I think we should keep trying - we haven't exhausted all ideas
> yet :-)

I don't think we disagree. To summarize this topic: I tried to keep
policy->cpu an actual online CPU so as to not break existing semantics in
this patch. Viresh asked "why not fix it at boot?". My response was to
keep it an online CPU and give it a shot in a separate patch if we really
want that. It's too risky to do that in this patch, and it's also not a
mandatory change for this patch.

I think we can work out the details on the need to fix policy->cpu at
boot, and whether there's even a need for policy->cpu (when we already have
policy->cpus), in a separate thread after the dust settles on this one?

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/15/2014 11:06 AM, Saravana Kannan wrote:
> On 07/14/2014 09:35 PM, Viresh Kumar wrote:
>> On 15 July 2014 00:38, Saravana Kannan wrote:
>>> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because
>>> the mutex would be uninitialized if it's stopped after boot, or it
>>> would never have been initialized (depending on how you fix
>>> policy->cpu at boot).
>>>
>>> Look at this snippet on the actual tree and it should be pretty evident.
>>
>> Yeah, I missed it. So the problem is we initialize timer_mutex's for
>> policy->cpus. So we need to do that just for policy->cpu, and also we
>> don't need a per-cpu timer_mutex anymore.
>
> Btw, I tried to take a stab at removing any assumption in cpufreq code
> about policy->cpu being ONLINE.

Wait, allowing an offline CPU to be the policy->cpu (i.e., the CPU which is
considered as the master of the policy/group) is just absurd. If there is
no leader, there is no army. We should NOT sacrifice sane semantics for the
sake of simplifying the code.

> There are 160 instances of those, of which 23 are in cpufreq.c.

And that explains why. It is just *natural* to assume that the CPUs
governed by a policy are online. Especially so for the CPU which is
supposed to be the policy leader. Let us please not change that - it will
become counter-intuitive if we do so. [ The other reason is that physical
hotplug is also possible on some systems... in that case your code might
make a CPU which is not even present (but possible) the policy->cpu, and
great 'fun' will ensue after that ;-( ]

The goal of this patchset should be to just de-couple the sysfs
files/ownership from the policy->cpu to an extent where it doesn't matter
who owns those files, and probably make it easier to do CPU hotplug
without having to destroy and recreate the files on every hotplug
operation.

This is exactly why the _implementation_ matters in this particular case -
if we can't achieve the simplification by keeping sane semantics, then we
shouldn't do the simplification!

That said, I think we should keep trying - we haven't exhausted all ideas
yet :-)

Regards,
Srivatsa S. Bhat

> So, even if we are sure cpufreq.c is fine, it's 137 other uses spread
> across all the other files. I definitely don't want to try and fix those
> as part of this patch. Way too risky and hard to get the test coverage it
> would need. Even some of the acpi cpufreq drivers seem to be making this
> assumption.
>
> Btw, I think v3 is done. I did some testing and it was fine. But I made
> some minor changes. Will test tomorrow to make sure I didn't break
> anything with the minor changes and then send them out.
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 15 July 2014 11:06, Saravana Kannan wrote:
> Btw, I tried to take a stab at removing any assumption in cpufreq code
> about policy->cpu being ONLINE. There are 160 instances of those, of which
> 23 are in cpufreq.c.
>
> So, even if we are sure cpufreq.c is fine, it's 137 other uses spread
> across all the other files. I definitely don't want to try and fix those
> as part of this patch. Way too risky and hard to get the test coverage it
> would need. Even some of the acpi cpufreq drivers seem to be making this
> assumption.

Hmm, yeah, that would be an issue. So this is what you should do now:
- Leave policy->cpu as it is, i.e. updated only when policy->cpu goes down.
- Just make sure the sysfs nodes are untouched when any cpu goes down.

> Btw, I think v3 is done. I did some testing and it was fine. But I made
> some minor changes. Will test tomorrow to make sure I didn't break
> anything with the minor changes and then send them out.

Ok, just comply with the above comments.
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/14/2014 09:35 PM, Viresh Kumar wrote:
> On 15 July 2014 00:38, Saravana Kannan wrote:
>> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because the
>> mutex would be uninitialized if it's stopped after boot, or it would
>> never have been initialized (depending on how you fix policy->cpu at
>> boot).
>>
>> Look at this snippet on the actual tree and it should be pretty evident.
>
> Yeah, I missed it. So the problem is we initialize timer_mutex's for
> policy->cpus. So we need to do that just for policy->cpu, and also we
> don't need a per-cpu timer_mutex anymore.

Btw, I tried to take a stab at removing any assumption in cpufreq code
about policy->cpu being ONLINE. There are 160 instances of those, of which
23 are in cpufreq.c.

So, even if we are sure cpufreq.c is fine, it's 137 other uses spread
across all the other files. I definitely don't want to try and fix those
as part of this patch. Way too risky and hard to get the test coverage it
would need. Even some of the acpi cpufreq drivers seem to be making this
assumption.

Btw, I think v3 is done. I did some testing and it was fine. But I made
some minor changes. Will test tomorrow to make sure I didn't break
anything with the minor changes and then send them out.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 15 July 2014 00:38, Saravana Kannan wrote:
> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because the
> mutex would be uninitialized if it's stopped after boot, or it would never
> have been initialized (depending on how you fix policy->cpu at boot).
>
> Look at this snippet on the actual tree and it should be pretty evident.

Yeah, I missed it. So the problem is we initialize timer_mutex's for
policy->cpus. So we need to do that just for policy->cpu, and also we
don't need a per-cpu timer_mutex anymore.
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/13/2014 11:13 PM, Viresh Kumar wrote:
> On 12 July 2014 08:36, Saravana Kannan wrote:
>> On 07/10/2014 11:19 PM, Viresh Kumar wrote:
>>> Please make sure you take care of these issues:
>>> - suspend/resume
>>> - hotplug
>>> - module insert/remove
>>
>> Ok, I was just looking at the current code. Does
>> cpufreq_unregister_driver() even really work correctly as it stands?
>>
>> It doesn't even seem to stop any of the governors/policies before it just
>> sets the cpufreq_driver pointer to NULL.
>>
>> So, technically my v2 patch doesn't even make anything worse when it
>> comes to unregistering the cpufreq driver.
>
> Are you really sure about this? I have tested this *myself* earlier..
> subsys_interface_unregister() should take care of stopping/freeing
> governor stuff..

I was asking this question based on looking at the code. Didn't actually
try it -- sent it just before being done for the day. I didn't know about
the subsys_interface_unregister() coming into play here. I'll take a look.
Thanks for the pointer.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/13/2014 11:09 PM, Viresh Kumar wrote:
> On 12 July 2014 08:14, Saravana Kannan wrote:
>>>> I'm just always adding the real nodes to the first CPU in a cluster
>>>> independent of which CPU gets added first. Makes it easier to know
>>>> which ones to symlink. See comment next to policy->cpu for full
>>>> context.
>>>
>>> Yeah, and that is the order in which CPUs will boot and
>>> cpufreq_add_dev() will be called. So, isn't policy->cpu the right CPU
>>> always?
>>
>> No, the "first" cpu in a cluster doesn't need to be the first one to be
>> added. An example is a 2x2 cluster system where the system is booted with
>> maxcpus=2 and then cpu3 could be onlined first by userspace.
>
> Because we are getting rid of much of the complexity now, I do not want
> policy->cpu to keep changing. Just fix it up to the cpu for which the
> policy gets created first. That's it. No more changes required. It doesn't
> matter at userspace which cpu owns it, as the symlinks would anyway
> duplicate it under every cpu.

I think you missed one of my comments in the email. I agree with what you
are saying here. I'll just do it as a separate patch to keep this one
simpler. I don't want to touch all the governors and other potential uses
of policy->cpu in this patch.

>> Yeah, it is pretty convoluted. But pretty much anywhere in the gov code
>> where policy->cpu is used could cause this. The specific crash I hit was
>> in this code:
>>
>> static void od_dbs_timer(struct work_struct *work)
>> {
>>         struct od_cpu_dbs_info_s *dbs_info =
>>                 container_of(work, struct od_cpu_dbs_info_s,
>>                              cdbs.work.work);
>>         unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
>>
>> === CPU is policy->cpu here.
>>
>>         struct od_cpu_dbs_info_s *core_dbs_info =
>>                 &per_cpu(od_cpu_dbs_info, cpu);
>>
>> === Picks the per-CPU struct of an offline CPU.
>>
>>         mutex_lock(&core_dbs_info->cdbs.timer_mutex);
>>
>> === Dies trying to lock a destroyed mutex.
>
> I am still not getting it. Why would we get into this if policy->cpu is
> fixed once at boot?

Yeah, it definitely crashes if policy->cpu is an offline cpu. Because the
mutex would be uninitialized if it's stopped after boot, or it would never
have been initialized (depending on how you fix policy->cpu at boot).

Look at this snippet on the actual tree and it should be pretty evident.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 12 July 2014 08:36, Saravana Kannan wrote:
> On 07/10/2014 11:19 PM, Viresh Kumar wrote:
>> Please make sure you take care of these issues:
>> - suspend/resume
>> - hotplug
>> - module insert/remove
>
> Ok, I was just looking at the current code. Does cpufreq_unregister_driver()
> even really work correctly as it stands?
>
> It doesn't even seem to stop any of the governors/policies before it just
> sets the cpufreq_driver pointer to NULL.
>
> So, technically my v2 patch doesn't even make anything worse when it comes
> to unregistering the cpufreq driver.

Are you really sure about this? I have tested this *myself* earlier.. subsys_interface_unregister() should take care of stopping/freeing governor stuff..
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 12 July 2014 08:14, Saravana Kannan wrote:
>>> I'm just always adding the real nodes to the first CPU in a cluster
>>> independent of which CPU gets added first. Makes it easier to know which
>>> ones to symlink. See comment next to policy->cpu for full context.
>>
>> Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev()
>> will be called. So, isn't policy->cpu the right CPU always?
>
> No, the "first" cpu in a cluster doesn't need to be the first one to be
> added. An example is 2x2 cluster system where the system is booted with max
> cpus = 2 and then cpu3 could be onlined first by userspace.

Because we are getting rid of much of the complexity now, I do not want policy->cpu to keep changing. Just fix it up to the cpu for which the policy gets created first. That's it. No more changes required. It doesn't matter at userspace which cpu owns it as symlinks would anyway duplicate it under every cpu.

> Yeah, it is pretty convoluted. But pretty much anywhere in the gov code
> where policy->cpu is used could cause this. The specific crash I hit was in
> this code:
>
> static void od_dbs_timer(struct work_struct *work)
> {
>         struct od_cpu_dbs_info_s *dbs_info =
>                 container_of(work, struct od_cpu_dbs_info_s,
>                              cdbs.work.work);
>         unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
>
> === CPU is policy->cpu here.
>
>         struct od_cpu_dbs_info_s *core_dbs_info =
>                 &per_cpu(od_cpu_dbs_info, cpu);
>
> === Picks the per-CPU struct of an offline CPU
>
>         mutex_lock(&core_dbs_info->cdbs.timer_mutex);
>
> === Dies trying to lock a destroyed mutex

I am still not getting it. Why would we get into this if policy->cpu is fixed once at boot?
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/13/2014 11:13 PM, Viresh Kumar wrote:
> On 12 July 2014 08:36, Saravana Kannan skan...@codeaurora.org wrote:
>> On 07/10/2014 11:19 PM, Viresh Kumar wrote:
>>> Please make sure you take care of these issues:
>>> - suspend/resume
>>> - hotplug
>>> - module insert/remove
>>
>> Ok, I was just looking at the current code. Does cpufreq_unregister_driver()
>> even really work correctly as it stands?
>>
>> It doesn't even seem to stop any of the governors/policies before it just
>> sets the cpufreq_driver pointer to NULL.
>>
>> So, technically my v2 patch doesn't even make anything worse when it comes
>> to unregistering the cpufreq driver.
>
> Are you really sure about this? I have tested this *myself* earlier..
> subsys_interface_unregister() should take care of stopping/freeing
> governor stuff..

I was asking this question based on looking at the code. Didn't actually try it -- sent it just before being done for the day.

I didn't know about the subsys_interface_unregister() coming into play here. I'll take a look. Thanks for the pointer.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 15 July 2014 00:38, Saravana Kannan skan...@codeaurora.org wrote:
> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because the
> mutex would be uninitialized if it's stopped after boot or it would never
> have been initialized (depending on how you fix policy->cpu at boot).
>
> Look at this snippet on the actual tree and it should be pretty evident.

Yeah, I missed it. So the problem is we initialize timer_mutex's for policy->cpus. So we need to do that just for policy->cpu and also we don't need a per-cpu timer_mutex anymore.
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/14/2014 09:35 PM, Viresh Kumar wrote:
> On 15 July 2014 00:38, Saravana Kannan skan...@codeaurora.org wrote:
>> Yeah, it definitely crashes if policy->cpu is an offline cpu. Because the
>> mutex would be uninitialized if it's stopped after boot or it would never
>> have been initialized (depending on how you fix policy->cpu at boot).
>>
>> Look at this snippet on the actual tree and it should be pretty evident.
>
> Yeah, I missed it. So the problem is we initialize timer_mutex's for
> policy->cpus. So we need to do that just for policy->cpu and also we don't
> need a per-cpu timer_mutex anymore.

Btw, I tried to take a stab at removing any assumption in cpufreq code about policy->cpu being ONLINE. There are 160 instances of those, of which 23 are in cpufreq.c. So, even if we are sure cpufreq.c is fine, it's 137 other uses spread across all the other files. I definitely don't want to try and fix those as part of this patch. Way too risky and hard to get the test coverage it would need. Even some of the acpi cpufreq drivers seem to be making this assumption.

Btw, I think v3 is done. I did some testing and it was fine. But made some minor changes. Will test tomorrow to make sure I didn't break anything with the minor changes and then send them out.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 15 July 2014 11:06, Saravana Kannan skan...@codeaurora.org wrote:
> Btw, I tried to take a stab at removing any assumption in cpufreq code
> about policy->cpu being ONLINE. There are 160 instances of those, of which
> 23 are in cpufreq.c. So, even if we are sure cpufreq.c is fine, it's 137
> other uses spread across all the other files. I definitely don't want to
> try and fix those as part of this patch. Way too risky and hard to get the
> test coverage it would need. Even some of the acpi cpufreq drivers seem to
> be making this assumption.

Hmm, yeah that would be an issue. So this is what you should do now:
- Leave policy->cpu as it is, i.e. updated only when policy->cpu goes down.
- Just make sure sysfs nodes are untouched when any cpu goes down.

> Btw, I think v3 is done. I did some testing and it was fine. But made some
> minor changes. Will test tomorrow to make sure I didn't break anything with
> the minor changes and then send them out.

Ok, just comply to the above comments.
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/10/2014 11:19 PM, Viresh Kumar wrote:
> Please make sure you take care of these issues:
> - suspend/resume
> - hotplug
> - module insert/remove

Ok, I was just looking at the current code. Does cpufreq_unregister_driver() even really work correctly as it stands?

It doesn't even seem to stop any of the governors/policies before it just sets the cpufreq_driver pointer to NULL.

So, technically my v2 patch doesn't even make anything worse when it comes to unregistering the cpufreq driver.

Similar issues for unregister_governor too!

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/11/2014 03:52 AM, Viresh Kumar wrote:

Just responding to one comment. The one about policy->cpu.

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
{
-       unsigned int j;
+       unsigned int j, first_cpu = cpumask_first(policy->related_cpus);
        int ret = 0;

-       for_each_cpu(j, policy->cpus) {
+       for_each_cpu(j, policy->related_cpus) {
                struct device *cpu_dev;

-               if (j == policy->cpu)
+               if (j == first_cpu)

why?

The first CPU in a cluster always owns the real nodes.

What I meant was, why not use policy->cpu?

+static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
{
        struct freq_attr **drv_attr;
+       struct device *dev;
        int ret = 0;

+       dev = get_cpu_device(cpumask_first(policy->related_cpus));
+       if (!dev)
+               return -EINVAL;
+

Why?

I'm just always adding the real nodes to the first CPU in a cluster independent of which CPU gets added first. Makes it easier to know which ones to symlink. See comment next to policy->cpu for full context.

Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev() will be called. So, isn't policy->cpu the right CPU always?

No, the "first" cpu in a cluster doesn't need to be the first one to be added. An example is 2x2 cluster system where the system is booted with max cpus = 2 and then cpu3 could be onlined first by userspace.

-       if (has_target()) {
+       cpus = cpumask_weight(policy->cpus);
+       policy->cpu = cpumask_first(policy->cpus);

why update it at all? Also, as per your logic what if cpus == 0?

Yeah, I didn't write it this way at first. But the governors are making the assumption that policy->cpu is always an online CPU. So, they try to

Are you sure? I had a quick look and failed to see that..

queue work there and use data structs of that CPU (even if they free it in the STOP event since it went offline).

So, it queues work on all policy->cpus, not policy->cpu. And the data structures are just allocated with a CPU number, it's fine if it's offline.

And where are we freeing that stuff in STOP? Sorry if I am really really tired and couldn't read it correctly.

Yeah, it is pretty convoluted. But pretty much anywhere in the gov code where policy->cpu is used could cause this. The specific crash I hit was in this code:

static void od_dbs_timer(struct work_struct *work)
{
        struct od_cpu_dbs_info_s *dbs_info =
                container_of(work, struct od_cpu_dbs_info_s,
                             cdbs.work.work);
        unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;

=== CPU is policy->cpu here.

        struct od_cpu_dbs_info_s *core_dbs_info =
                &per_cpu(od_cpu_dbs_info, cpu);

=== Picks the per-CPU struct of an offline CPU

        mutex_lock(&core_dbs_info->cdbs.timer_mutex);

=== Dies trying to lock a destroyed mutex

Another option is to leave policy->cpu unchanged and then fix all the governors. But this patch would get even more complicated. So, we can leave this as is, or fix that up in a separate patch.

Since we are simplifying it here, I think we should NOT change policy->cpu at all. It will make life simple (probably).

I agree, but then I would have to fix up the governors. In the interest of keeping this patch small, I'll continue with what I'm doing and fix it up in another patch.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 11 July 2014 15:29, wrote:
> Viresh Kumar wrote:
>> On 11 July 2014 09:48, Saravana Kannan wrote:
>>> * Policy settings and governor tunables maintained across suspend/resume
>>> and hotplug.
>>
>> It's already maintained during suspend/resume.
>
> But not across hotplug. Which is also very useful when you have 2 clusters
> and one gets hotplugged offline due to thermal and then reinserted.
> Userspace has to come and restore it back today. In our tree, we "stitched
> up" the governor. Also, this makes the suspend/resume code a tiny bit
> simpler -- in the sense, it's not a special case anymore.

Yeah, I understood that. I was just pointing that you need to mention hotplug alone in the bullet point.

>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
>>> {
>>> -       unsigned int j;
>>> +       unsigned int j, first_cpu = cpumask_first(policy->related_cpus);
>>>         int ret = 0;
>>>
>>> -       for_each_cpu(j, policy->cpus) {
>>> +       for_each_cpu(j, policy->related_cpus) {
>>>                 struct device *cpu_dev;
>>>
>>> -               if (j == policy->cpu)
>>> +               if (j == first_cpu)
>>
>> why?
>
> The first CPU in a cluster always owns the real nodes.

What I meant was, why not use policy->cpu?

>>> +static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
>>> {
>>>         struct freq_attr **drv_attr;
>>> +       struct device *dev;
>>>         int ret = 0;
>>>
>>> +       dev = get_cpu_device(cpumask_first(policy->related_cpus));
>>> +       if (!dev)
>>> +               return -EINVAL;
>>> +
>>
>> Why?
>
> I'm just always adding the real nodes to the first CPU in a cluster
> independent of which CPU gets added first. Makes it easier to know which
> ones to symlink. See comment next to policy->cpu for full context.

Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev() will be called. So, isn't policy->cpu the right CPU always?
>>> +       blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>> +                                    CPUFREQ_UPDATE_POLICY_CPU, policy);
>>
>> This should be only called when policy->cpu is updated. And shouldn't
>> be called anymore and can be dropped as you might not wanna change
>> policy->cpu after this patch.
>
> Right. I should have reordered this to after.
>
> But I always HAVE to send it when I add/remove a CPU. That's how I believe
> cpufreq stats can keep track of CPUs going offline.

No. cpufreq-stats doesn't have to care about CPUs going in/out. It just cares about policy->cpu and so we notify it when that gets updated.

> Oh, yeah. Which brings me to another point. If I'm not mistaken, cpufreq
> stats keeps track of policy stats. Really, it should track a CPU's stats.
> If it's hotplugged out, it should count that time towards its current freq.

Even if CPU is hotplugged out, it is getting clock from the same pll. And so if clock for rest of the CPUs change, it changes for the hotplugged out one as well.. Even if it is not running. So, we must account on the new frequencies.

>>> -       if (has_target()) {
>>> +       cpus = cpumask_weight(policy->cpus);
>>> +       policy->cpu = cpumask_first(policy->cpus);
>>
>> why update it at all? Also, as per your logic what if cpus == 0?
>
> Yeah, I didn't write it this way at first. But the governors are making
> the assumption that policy->cpu is always an online CPU. So, they try to

Are you sure? I had a quick look and failed to see that..

> queue work there and use data structs of that CPU (even if they free it in
> the STOP event since it went offline).

So, it queues work on all policy->cpus, not policy->cpu. And the data structures are just allocated with a CPU number, it's fine if it's offline.

And where are we freeing that stuff in STOP ? Sorry if I am really really tired and couldn't read it correctly.

> Another option is to leave policy->cpu unchanged and then fix all the
> governors. But this patch would get even more complicated.
So, we can > leave this as is, or fix that up in a separate patch. Since we are simplifying it here, I think we should NOT change policy->cpu at all. It will make life simple (probably). >>> + if (has_target() && cpus > 0) { >> >> Instead of > or < use cpus or !cpus. > > Kinda personal style I guess. cpus > 0 reads like "there's more than 0 > CPUs" which makes sense in English too. But sure, I can change if you > really want to. Actually at places its compared with 0 or 1, and so thought cpus, !cpus would be better.. > Sorry if I gave the impression that this patch is complete. No, you didn't :) >>> + cpufreq_driver->stop_cpu(policy); >> >> Where is ->exit() gone? >> > > Why should I exit? I don't think we need to. I'm not planning on exit() > and init() every time an entire cluster is offlined or onlined. Just stop > the governor and let if go quiet. Okay, my driver is compiled as a module. I
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
skan...@codeaurora.org wrote:
>> Viresh Kumar wrote:
>>> Hi Saravana,
>>>
>>> Thanks for trying this..
>>>
>>> On 11 July 2014 09:48, Saravana Kannan wrote:
>>>> The CPUfreq driver moves the cpufreq policy ownership between CPUs when
>>>
>>> s/driver/core
>>
>> Will do

So many typos. This is what happens when I send a late night email! If a sentence sounds incomplete, this is why.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
Srivatsa S. Bhat wrote: > On 07/11/2014 09:48 AM, Saravana Kannan wrote: >> The CPUfreq driver moves the cpufreq policy ownership between CPUs when >> CPUs within a cluster (CPUs sharing same policy) go ONLINE/OFFLINE. When >> moving policy ownership between CPUs, it also moves the cpufreq sysfs >> directory between CPUs and also fixes up the symlinks of the other CPUs >> in >> the cluster. >> >> Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and >> directories are deleted, the kobject is released and the policy is >> freed. >> And when the first CPU in a cluster comes up, the policy is reallocated >> and >> initialized, kobject is acquired, the sysfs nodes are created or >> symlinked, >> etc. >> >> All these steps end up creating unnecessarily complicated code and >> locking. >> There's no real benefit to adding/removing/moving the sysfs nodes and >> the >> policy between CPUs. Other per CPU sysfs directories like power and >> cpuidle >> are left alone during hotplug. So there's some precedence to what this >> patch is trying to do. >> >> This patch simplifies a lot of the code and locking by removing the >> adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq >> directory and policy in place irrespective of whether the CPUs are >> ONLINE/OFFLINE. >> >> Leaving the policy, sysfs and kobject in place also brings these >> additional >> benefits: >> * Faster suspend/resume. >> * Faster hotplug. >> * Sysfs file permissions maintained across hotplug without userspace >> workarounds. >> * Policy settings and governor tunables maintained across suspend/resume >> and hotplug. >> * Cpufreq stats would be maintained across hotplug for all CPUs and can >> be >> queried even after CPU goes OFFLINE. >> >> Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0 >> Tested-by: Stephen Boyd >> Signed-off-by: Saravana Kannan >> --- >> >> Preliminary testing has been done. cpufreq directories are getting >> created >> properly. 
Online/offline of CPUs work. Policies remain unmodifiable from >> userspace when all policy CPUs are offline. >> >> Error handling code has NOT been updated. >> >> I've added a bunch of FIXME comments next to where I'm not sure about >> the >> locking in the existing code. I believe most of the try_lock's were >> present >> to prevent a deadlock between sysfs lock and the cpufreq locks. Now that >> the sysfs entries are not touched after creating them, we should be able >> to >> replace most/all of these try_lock's with a normal lock. >> >> This patch has more room for code simplification, but I would like to >> get >> some acks for the functionality and this code before I do further >> simplification. >> > > The idea behind this work is very welcome indeed! IMHO, there is nothing > conceptually wrong in maintaining the per-cpu sysfs files across CPU > hotplug > (as long as we take care to return appropriate error codes if userspace > tries to set values using the control files of offline CPUs). So, it > really > boils down to whether or not we get the implementation right; the idea > itself > looks fine as of now. Hence, your efforts in making this patch(set) easier > to > review will certainly help. Perhaps you can simplify the code later, but > at > this point, splitting up this patch into multiple smaller, reviewable > pieces > (accompanied by well-written changelogs that explain the intent) is the > utmost > priority. Just like Viresh, even I had a hard time reviewing all of this > in > one go. > > Thank you for taking up this work! Thanks for the support. I'll keep in mind to keep the patches simple and not do unnecessary optimizations. But the first patch diff unfortunately is going to be a bit big since it'll delete a lot of code. :( But I'll add more detailed commit text or "cover" text in the next one. I don't want to split up the patch so much that individual ones don't compile or boot. 
Maybe after patch v3, if you guys can suggest splitting it up into chunks that won't involve huge rewrites, I can try to do that.

-Saravana
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
Viresh Kumar wrote: > Hi Saravana, > > Thanks for trying this.. > > On 11 July 2014 09:48, Saravana Kannan wrote: >> The CPUfreq driver moves the cpufreq policy ownership between CPUs when > > s/driver/core Will do > >> CPUs within a cluster (CPUs sharing same policy) go ONLINE/OFFLINE. When >> moving policy ownership between CPUs, it also moves the cpufreq sysfs >> directory between CPUs and also fixes up the symlinks of the other CPUs >> in >> the cluster. >> >> Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and >> directories are deleted, the kobject is released and the policy is >> freed. >> And when the first CPU in a cluster comes up, the policy is reallocated >> and >> initialized, kobject is acquired, the sysfs nodes are created or >> symlinked, >> etc. >> >> All these steps end up creating unnecessarily complicated code and >> locking. >> There's no real benefit to adding/removing/moving the sysfs nodes and >> the >> policy between CPUs. Other per CPU sysfs directories like power and >> cpuidle >> are left alone during hotplug. So there's some precedence to what this >> patch is trying to do. >> >> This patch simplifies a lot of the code and locking by removing the >> adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq >> directory and policy in place irrespective of whether the CPUs are >> ONLINE/OFFLINE. >> >> Leaving the policy, sysfs and kobject in place also brings these >> additional >> benefits: >> * Faster suspend/resume. >> * Faster hotplug. >> * Sysfs file permissions maintained across hotplug without userspace >> workarounds. >> * Policy settings and governor tunables maintained across suspend/resume >> and hotplug. > > Its already maintained during suspend/resume. But not across hotplug. Which is also very useful when you have 2 clusters and one gets hotplugged offline due to thermal and then reinserted. Userspace has to come and restore it back today. In our tree, we "stitched up" the governor. 
Also, this make the suspend/resume code a tiny bit simpler -- in the sense, it's not a special case anymore. >> * Cpufreq stats would be maintained across hotplug for all CPUs and can >> be >> queried even after CPU goes OFFLINE. >> >> Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0 > > remove these while sending stuff upstream.. Yeah, I always think of doing this but keep forgetting in the last minute. >> Tested-by: Stephen Boyd >> Signed-off-by: Saravana Kannan >> --- >> >> Preliminary testing has been done. cpufreq directories are getting >> created >> properly. Online/offline of CPUs work. Policies remain unmodifiable from >> userspace when all policy CPUs are offline. >> >> Error handling code has NOT been updated. >> >> I've added a bunch of FIXME comments next to where I'm not sure about >> the >> locking in the existing code. I believe most of the try_lock's were >> present >> to prevent a deadlock between sysfs lock and the cpufreq locks. Now that >> the sysfs entries are not touched after creating them, we should be able >> to >> replace most/all of these try_lock's with a normal lock. >> >> This patch has more room for code simplification, but I would like to >> get >> some acks for the functionality and this code before I do further >> simplification. >> >> I should also be able to remove get_online_cpus() in the store function >> and >> replace it with just a check for policy->governor_enabled. That should >> theoretically reduce some contention between cpufreq stats check and >> hotplug of unrelated CPUs. > > Its just too much stuff in a single patch, I can still review it as I > am very much > aware of every bit of code written here. But would be very difficult for > others > to review it. These are so many cases, configuration we have to think of > and adding bugs with such a large patch is so so so easy. Actually this is the smallest bit of code that will work. Well, after I fix suspend/resume. 
I'm trying to make each patch such that the tree continues to work after it's pulled in. Unfortunately, I'm throwing away so much code that it ends up with a fairly large diff. But if you look at the actual final code, it's very simple. But I do see your point. I'll try to keep the patch as small as possible to continue working and make additional improvements as additional patches. >> drivers/cpufreq/cpufreq.c | 331 >> ++ >> 1 file changed, 69 insertions(+), 262 deletions(-) >> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c >> index 62259d2..e350b15 100644 >> --- a/drivers/cpufreq/cpufreq.c >> +++ b/drivers/cpufreq/cpufreq.c >> @@ -859,16 +859,16 @@ void cpufreq_sysfs_remove_file(const struct >> attribute *attr) >> } >> EXPORT_SYMBOL(cpufreq_sysfs_remove_file); >> >> -/* symlink affected CPUs */ >> +/* symlink related CPUs */ >> static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) >> { >> - unsigned int j; >> + unsigned
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/11/2014 09:48 AM, Saravana Kannan wrote: > The CPUfreq driver moves the cpufreq policy ownership between CPUs when > CPUs within a cluster (CPUs sharing same policy) go ONLINE/OFFLINE. When > moving policy ownership between CPUs, it also moves the cpufreq sysfs > directory between CPUs and also fixes up the symlinks of the other CPUs in > the cluster. > > Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and > directories are deleted, the kobject is released and the policy is freed. > And when the first CPU in a cluster comes up, the policy is reallocated and > initialized, kobject is acquired, the sysfs nodes are created or symlinked, > etc. > > All these steps end up creating unnecessarily complicated code and locking. > There's no real benefit to adding/removing/moving the sysfs nodes and the > policy between CPUs. Other per CPU sysfs directories like power and cpuidle > are left alone during hotplug. So there's some precedence to what this > patch is trying to do. > > This patch simplifies a lot of the code and locking by removing the > adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq > directory and policy in place irrespective of whether the CPUs are > ONLINE/OFFLINE. > > Leaving the policy, sysfs and kobject in place also brings these additional > benefits: > * Faster suspend/resume. > * Faster hotplug. > * Sysfs file permissions maintained across hotplug without userspace > workarounds. > * Policy settings and governor tunables maintained across suspend/resume > and hotplug. > * Cpufreq stats would be maintained across hotplug for all CPUs and can be > queried even after CPU goes OFFLINE. > > Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0 > Tested-by: Stephen Boyd > Signed-off-by: Saravana Kannan > --- > > Preliminary testing has been done. cpufreq directories are getting created > properly. Online/offline of CPUs work. Policies remain unmodifiable from > userspace when all policy CPUs are offline. 
> > Error handling code has NOT been updated. > > I've added a bunch of FIXME comments next to where I'm not sure about the > locking in the existing code. I believe most of the try_lock's were present > to prevent a deadlock between sysfs lock and the cpufreq locks. Now that > the sysfs entries are not touched after creating them, we should be able to > replace most/all of these try_lock's with a normal lock. > > This patch has more room for code simplification, but I would like to get > some acks for the functionality and this code before I do further > simplification. > The idea behind this work is very welcome indeed! IMHO, there is nothing conceptually wrong in maintaining the per-cpu sysfs files across CPU hotplug (as long as we take care to return appropriate error codes if userspace tries to set values using the control files of offline CPUs). So, it really boils down to whether or not we get the implementation right; the idea itself looks fine as of now. Hence, your efforts in making this patch(set) easier to review will certainly help. Perhaps you can simplify the code later, but at this point, splitting up this patch into multiple smaller, reviewable pieces (accompanied by well-written changelogs that explain the intent) is the utmost priority. Just like Viresh, even I had a hard time reviewing all of this in one go. Thank you for taking up this work! Regards, Srivatsa S. Bhat > I should also be able to remove get_online_cpus() in the store function and > replace it with just a check for policy->governor_enabled. That should > theoretically reduce some contention between cpufreq stats check and > hotplug of unrelated CPUs. > > Appreciate all the feedback. 
> > Thanks, > Saravana > > drivers/cpufreq/cpufreq.c | 331 > ++ > 1 file changed, 69 insertions(+), 262 deletions(-) > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index 62259d2..e350b15 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -859,16 +859,16 @@ void cpufreq_sysfs_remove_file(const struct attribute > *attr) > } > EXPORT_SYMBOL(cpufreq_sysfs_remove_file); > > -/* symlink affected CPUs */ > +/* symlink related CPUs */ > static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) > { > - unsigned int j; > + unsigned int j, first_cpu = cpumask_first(policy->related_cpus); > int ret = 0; > > - for_each_cpu(j, policy->cpus) { > + for_each_cpu(j, policy->related_cpus) { > struct device *cpu_dev; > > - if (j == policy->cpu) > + if (j == first_cpu) > continue; > > pr_debug("Adding link for CPU: %u\n", j); > @@ -881,12 +881,16 @@ static int cpufreq_add_dev_symlink(struct > cpufreq_policy *policy) > return ret; > } > > -static int cpufreq_add_dev_interface(struct cpufreq_policy *policy, > - struct device *dev) >
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
Hi Saravana, Thanks for trying this.. On 11 July 2014 09:48, Saravana Kannan wrote: > The CPUfreq driver moves the cpufreq policy ownership between CPUs when s/driver/core > CPUs within a cluster (CPUs sharing same policy) go ONLINE/OFFLINE. When > moving policy ownership between CPUs, it also moves the cpufreq sysfs > directory between CPUs and also fixes up the symlinks of the other CPUs in > the cluster. > > Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and > directories are deleted, the kobject is released and the policy is freed. > And when the first CPU in a cluster comes up, the policy is reallocated and > initialized, kobject is acquired, the sysfs nodes are created or symlinked, > etc. > > All these steps end up creating unnecessarily complicated code and locking. > There's no real benefit to adding/removing/moving the sysfs nodes and the > policy between CPUs. Other per CPU sysfs directories like power and cpuidle > are left alone during hotplug. So there's some precedence to what this > patch is trying to do. > > This patch simplifies a lot of the code and locking by removing the > adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq > directory and policy in place irrespective of whether the CPUs are > ONLINE/OFFLINE. > > Leaving the policy, sysfs and kobject in place also brings these additional > benefits: > * Faster suspend/resume. > * Faster hotplug. > * Sysfs file permissions maintained across hotplug without userspace > workarounds. > * Policy settings and governor tunables maintained across suspend/resume > and hotplug. Its already maintained during suspend/resume. > * Cpufreq stats would be maintained across hotplug for all CPUs and can be > queried even after CPU goes OFFLINE. > > Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0 remove these while sending stuff upstream.. > Tested-by: Stephen Boyd > Signed-off-by: Saravana Kannan > --- > > Preliminary testing has been done. 
cpufreq directories are getting created > properly. Online/offline of CPUs work. Policies remain unmodifiable from > userspace when all policy CPUs are offline. > > Error handling code has NOT been updated. > > I've added a bunch of FIXME comments next to where I'm not sure about the > locking in the existing code. I believe most of the try_lock's were present > to prevent a deadlock between sysfs lock and the cpufreq locks. Now that > the sysfs entries are not touched after creating them, we should be able to > replace most/all of these try_lock's with a normal lock. > > This patch has more room for code simplification, but I would like to get > some acks for the functionality and this code before I do further > simplification. > > I should also be able to remove get_online_cpus() in the store function and > replace it with just a check for policy->governor_enabled. That should > theoretically reduce some contention between cpufreq stats check and > hotplug of unrelated CPUs. Its just too much stuff in a single patch, I can still review it as I am very much aware of every bit of code written here. But would be very difficult for others to review it. These are so many cases, configuration we have to think of and adding bugs with such a large patch is so so so easy. 
> drivers/cpufreq/cpufreq.c | 331 > ++ > 1 file changed, 69 insertions(+), 262 deletions(-) > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index 62259d2..e350b15 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -859,16 +859,16 @@ void cpufreq_sysfs_remove_file(const struct attribute > *attr) > } > EXPORT_SYMBOL(cpufreq_sysfs_remove_file); > > -/* symlink affected CPUs */ > +/* symlink related CPUs */ > static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy) > { > - unsigned int j; > + unsigned int j, first_cpu = cpumask_first(policy->related_cpus); > int ret = 0; > > - for_each_cpu(j, policy->cpus) { > + for_each_cpu(j, policy->related_cpus) { > struct device *cpu_dev; > > - if (j == policy->cpu) > + if (j == first_cpu) why? > continue; > > pr_debug("Adding link for CPU: %u\n", j); > @@ -881,12 +881,16 @@ static int cpufreq_add_dev_symlink(struct > cpufreq_policy *policy) > return ret; > } > > -static int cpufreq_add_dev_interface(struct cpufreq_policy *policy, > -struct device *dev) > +static int cpufreq_add_dev_interface(struct cpufreq_policy *policy) > { > struct freq_attr **drv_attr; > + struct device *dev; > int ret = 0; > > + dev = get_cpu_device(cpumask_first(policy->related_cpus)); > + if (!dev) > + return -EINVAL; > + Why? > /* prepare interface data */ > ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj, "cpufreq");
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
Srivatsa S. Bhat wrote: > On 07/11/2014 09:48 AM, Saravana Kannan wrote: >> The CPUfreq driver moves the cpufreq policy ownership between CPUs when CPUs within a cluster (CPUs sharing same policy) go ONLINE/OFFLINE. When moving policy ownership between CPUs, it also moves the cpufreq sysfs directory between CPUs and also fixes up the symlinks of the other CPUs in the cluster. >> Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and directories are deleted, the kobject is released and the policy is freed. And when the first CPU in a cluster comes up, the policy is reallocated and initialized, kobject is acquired, the sysfs nodes are created or symlinked, etc. >> All these steps end up creating unnecessarily complicated code and locking. There's no real benefit to adding/removing/moving the sysfs nodes and the policy between CPUs. Other per CPU sysfs directories like power and cpuidle are left alone during hotplug. So there's some precedence to what this patch is trying to do. >> This patch simplifies a lot of the code and locking by removing the adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq directory and policy in place irrespective of whether the CPUs are ONLINE/OFFLINE. >> Leaving the policy, sysfs and kobject in place also brings these additional benefits: * Faster suspend/resume. * Faster hotplug. * Sysfs file permissions maintained across hotplug without userspace workarounds. * Policy settings and governor tunables maintained across suspend/resume and hotplug. * Cpufreq stats would be maintained across hotplug for all CPUs and can be queried even after CPU goes OFFLINE. >> Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0 Tested-by: Stephen Boyd sb...@codeaurora.org Signed-off-by: Saravana Kannan skan...@codeaurora.org --- >> Preliminary testing has been done. cpufreq directories are getting created properly. Online/offline of CPUs work. Policies remain unmodifiable from userspace when all policy CPUs are offline. 
>> Error handling code has NOT been updated. >> I've added a bunch of FIXME comments next to where I'm not sure about the locking in the existing code. I believe most of the try_lock's were present to prevent a deadlock between sysfs lock and the cpufreq locks. Now that the sysfs entries are not touched after creating them, we should be able to replace most/all of these try_lock's with a normal lock. >> This patch has more room for code simplification, but I would like to get some acks for the functionality and this code before I do further simplification. > The idea behind this work is very welcome indeed! IMHO, there is nothing conceptually wrong in maintaining the per-cpu sysfs files across CPU hotplug (as long as we take care to return appropriate error codes if userspace tries to set values using the control files of offline CPUs). So, it really boils down to whether or not we get the implementation right; the idea itself looks fine as of now. Hence, your efforts in making this patch(set) easier to review will certainly help. Perhaps you can simplify the code later, but at this point, splitting up this patch into multiple smaller, reviewable pieces (accompanied by well-written changelogs that explain the intent) is the utmost priority. Just like Viresh, even I had a hard time reviewing all of this in one go. Thank you for taking up this work! 
Thanks for the support. I'll keep in mind to keep the patches simple and not do unnecessary optimizations. But the first patch diff unfortunately is going to be a bit big since it'll delete a lot of code. :( But I'll add more detailed commit text or cover text in the next one. I don't want to split up the patch so much that individual ones don't compile or boot. Maybe after patch v3, if you guys can suggest splitting it up into chunks that won't involve huge rewrites, I can try to do that. -Saravana -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
skan...@codeaurora.org wrote: > Viresh Kumar wrote: >> Hi Saravana, Thanks for trying this.. On 11 July 2014 09:48, Saravana Kannan skan...@codeaurora.org wrote: >>> The CPUfreq driver moves the cpufreq policy ownership between CPUs when >> s/driver/core > Will do <snip> So many typos. This is what happens when I send a late night email! If a sentence sounds incomplete, this is why. -Saravana -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 11 July 2014 15:29, skan...@codeaurora.org wrote:
> Viresh Kumar wrote:
>> On 11 July 2014 09:48, Saravana Kannan <skan...@codeaurora.org> wrote:
>>> * Policy settings and governor tunables maintained across suspend/resume
>>>   and hotplug.
>>
>> It's already maintained during suspend/resume. But not across hotplug.
>
> Which is also very useful when you have 2 clusters and one gets hotplugged
> offline due to thermal and then reinserted. Userspace has to come and
> restore it back today. In our tree, we stitched up the governor. Also,
> this makes the suspend/resume code a tiny bit simpler -- in the sense that
> it's not a special case anymore.

Yeah, I understood that. I was just pointing out that you need to mention
hotplug alone in the bullet point.

>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>
>>>  static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
>>>  {
>>> -	unsigned int j;
>>> +	unsigned int j, first_cpu = cpumask_first(policy->related_cpus);
>>>  	int ret = 0;
>>>
>>> -	for_each_cpu(j, policy->cpus) {
>>> +	for_each_cpu(j, policy->related_cpus) {
>>>  		struct device *cpu_dev;
>>>
>>> -		if (j == policy->cpu)
>>> +		if (j == first_cpu)
>>
>> why?
>
> The first CPU in a cluster always owns the real nodes.

What I meant was, why not use policy->cpu?

>>> +static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
>>>  {
>>>  	struct freq_attr **drv_attr;
>>> +	struct device *dev;
>>>  	int ret = 0;
>>>
>>> +	dev = get_cpu_device(cpumask_first(policy->related_cpus));
>>> +	if (!dev)
>>> +		return -EINVAL;
>>> +
>>
>> Why?
>
> I'm just always adding the real nodes to the first CPU in a cluster,
> independent of which CPU gets added first. Makes it easier to know which
> ones to symlink. See comment next to policy->cpu for full context.

Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev()
will be called. So, isn't policy->cpu the right CPU always?

>>> +	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>> +			CPUFREQ_UPDATE_POLICY_CPU, policy);
>>
>> This should only be called when policy->cpu is updated. And it shouldn't
>> be called anymore and can be dropped, as you might not wanna change
>> policy->cpu after this patch.
>
> Right. I should have reordered this to after. But I always HAVE to send it
> when I add/remove a CPU. That's how I believe cpufreq stats can keep track
> of CPUs going offline.

No. cpufreq-stats doesn't have to care about CPUs going in/out. It just
cares about policy->cpu and so we notify it when that gets updated.

> Oh, yeah. Which brings me to another point. If I'm not mistaken, cpufreq
> stats keeps track of policy stats. Really, it should track a CPU's stats.
> If it's hotplugged out, it should count that time towards its current
> freq.

Even if a CPU is hotplugged out, it is getting its clock from the same
pll. And so if the clock for the rest of the CPUs changes, it changes for
the hotplugged-out one as well, even if it is not running. So, we must
account on the new frequencies.

>>> -	if (has_target()) {
>>> +	cpus = cpumask_weight(policy->cpus);
>>> +	policy->cpu = cpumask_first(policy->cpus);
>>
>> why update it at all? Also, as per your logic what if cpus == 0?
>
> Yeah, I didn't write it this way at first. But the governors are making
> the assumption that policy->cpu is always an online CPU. So, they try to

Are you sure? I had a quick look and failed to see that..

> queue work there and use data structs of that CPU (even if they free it
> in the STOP event since it went offline).

So, it queues work on all policy->cpus, not policy->cpu. And the data
structures are just allocated with a CPU number; it's fine if it's
offline.

And where are we freeing that stuff in STOP? Sorry if I am really really
tired and couldn't read it correctly.

> Another option is to leave policy->cpu unchanged and then fix all the
> governors. But this patch would get even more complicated. So, we can
> leave this as is, or fix that up in a separate patch.

Since we are simplifying it here, I think we should NOT change policy->cpu
at all. It will make life simple (probably).

>>> +	if (has_target() && cpus > 0) {
>>
>> Instead of "> 0" use "cpus" or "!cpus".
>
> Kinda personal style I guess. "cpus > 0" reads like "there's more than 0
> CPUs", which makes sense in English too. But sure, I can change it if you
> really want to.

Actually at places it's compared with 0 or 1, and so I thought cpus/!cpus
would be better..

> Sorry if I gave the impression that this patch is complete.

No, you didn't :)

>>> +		cpufreq_driver->stop_cpu(policy);
>>
>> Where is ->exit() gone?
>
> Why should I exit? I don't think we need to. I'm not planning on exit()
> and init() every time an entire cluster is offlined or onlined. Just stop
> the governor and let it go quiet.

Okay, my driver is compiled as a module. I insert/remove it multiple
times. Only ->init() will be called multiple times, no ->exit()?

>>>  #ifdef CONFIG_SMP
>>>
>>>  	/* check whether a different CPU
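The recurring question in this exchange is which CPU should own the real sysfs nodes: the patch picks the first CPU of policy->related_cpus (fixed for the life of the policy), while policy->cpu can change with hotplug. As a userspace model of just that selection rule -- the masks are plain bitmasks and the model_* names are hypothetical, not kernel API -- the idea looks like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the "first CPU in related_cpus owns the sysfs nodes"
 * rule discussed above. A cpumask is modeled as a 32-bit bitmask; the
 * model_* names are hypothetical and only illustrate the selection logic.
 */
typedef uint32_t model_cpumask_t;

/* Index of the lowest set bit, i.e. a model of cpumask_first(). */
static int model_cpumask_first(model_cpumask_t mask)
{
    for (int cpu = 0; cpu < 32; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return -1; /* empty mask */
}

/*
 * Owner of the real sysfs directory: the first CPU of related_cpus,
 * regardless of which CPUs happen to be online or which one was added
 * first -- so the owner never changes across hotplug.
 */
static int model_sysfs_owner(model_cpumask_t related_cpus)
{
    return model_cpumask_first(related_cpus);
}
```

The point of the design choice is that related_cpus is fixed at init time, so every other CPU in the cluster can always be symlinked to the same owner directory.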
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/11/2014 03:52 AM, Viresh Kumar wrote:

Just responding to one comment. The one about policy->cpu.

>>>> -		if (j == policy->cpu)
>>>> +		if (j == first_cpu)
>>>
>>> why?
>>
>> The first CPU in a cluster always owns the real nodes.
>
> What I meant was, why not use policy->cpu?
>
> Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev()
> will be called. So, isn't policy->cpu the right CPU always?

No, the first CPU in a cluster doesn't need to be the first one to be
added. An example is a 2x2 cluster system where the system is booted with
maxcpus=2 and then cpu3 could be onlined first by userspace.

>>>> -	if (has_target()) {
>>>> +	cpus = cpumask_weight(policy->cpus);
>>>> +	policy->cpu = cpumask_first(policy->cpus);
>>>
>>> why update it at all? Also, as per your logic what if cpus == 0?
>>
>> Yeah, I didn't write it this way at first. But the governors are making
>> the assumption that policy->cpu is always an online CPU. So, they try to
>
> Are you sure? I had a quick look and failed to see that..
>
>> queue work there and use data structs of that CPU (even if they free it
>> in the STOP event since it went offline).
>
> So, it queues work on all policy->cpus, not policy->cpu. And the data
> structures are just allocated with a CPU number; it's fine if it's
> offline.
>
> And where are we freeing that stuff in STOP? Sorry if I am really really
> tired and couldn't read it correctly.

Yeah, it is pretty convoluted. But pretty much anywhere in the governor
code where policy->cpu is used could cause this. The specific crash I hit
was in this code:

static void od_dbs_timer(struct work_struct *work)
{
	struct od_cpu_dbs_info_s *dbs_info =
		container_of(work, struct od_cpu_dbs_info_s, cdbs.work.work);
	unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
		<=== CPU is policy->cpu here.
	struct od_cpu_dbs_info_s *core_dbs_info =
		&per_cpu(od_cpu_dbs_info, cpu);
		<=== Picks the per-CPU struct of an offline CPU
	<snip>
	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
		<=== Dies trying to lock a destroyed mutex

>> Another option is to leave policy->cpu unchanged and then fix all the
>> governors. But this patch would get even more complicated. So, we can
>> leave this as is, or fix that up in a separate patch.
>
> Since we are simplifying it here, I think we should NOT change
> policy->cpu at all. It will make life simple (probably).

I agree, but then I would have to fix up the governors. In the interest of
keeping this patch small, I'll continue with what I'm doing and fix it up
in another patch.

-Saravana

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
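The crash walked through above boils down to dereferencing per-CPU state keyed by policy->cpu after that CPU has gone offline and its timer_mutex has been destroyed. One defensive direction -- a sketch only, with hypothetical model_* names and bitmask cpumasks, not the fix the patch actually takes -- is to resolve to an online member of the policy mask before touching any per-CPU data:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model: cpumasks are bitmasks; model_* names are hypothetical. */
typedef uint32_t model_cpumask_t;

/*
 * Return policy_cpu if it is still online, otherwise fall back to the
 * first online CPU in the policy mask; -1 if the whole policy is offline.
 * This models a guard against the od_dbs_timer() hazard: indexing per-CPU
 * data by an offline policy->cpu whose mutex has already been destroyed.
 */
static int model_pick_online_cpu(int policy_cpu, model_cpumask_t policy_cpus,
                                 model_cpumask_t online)
{
    if (online & (1u << policy_cpu))
        return policy_cpu;

    model_cpumask_t usable = policy_cpus & online;
    for (int cpu = 0; cpu < 32; cpu++)
        if (usable & (1u << cpu))
            return cpu;
    return -1; /* every CPU of this policy is offline */
}
```

The thread's actual conclusion is different: keep policy->cpu stable and fix the governors separately, so a runtime fallback like this would not be needed.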
Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
On 07/10/2014 11:19 PM, Viresh Kumar wrote:
> Please make sure you take care of these issues:
> - suspend/resume
> - hotplug
> - module insert/remove

Ok. I was just looking at the current code. Does
cpufreq_unregister_driver() even really work correctly as it stands? It
doesn't even seem to stop any of the governors/policies before it just
sets the cpufreq_driver pointer to NULL. So, technically my v2 patch
doesn't even make anything worse when it comes to unregistering the
cpufreq driver.

Similar issues for unregister_governor too!

-Saravana

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
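The complaint here is about teardown ordering: clearing the driver pointer before stopping governors leaves them running against a NULL driver. A minimal userspace sketch of the ordering being argued for -- hypothetical model_* names, not the actual cpufreq_unregister_driver() implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal model of one cpufreq policy; names are hypothetical. */
struct model_policy {
    bool governor_running;
    bool exited;
};

static void model_stop_governor(struct model_policy *p)
{
    p->governor_running = false;
}

static void model_exit_policy(struct model_policy *p)
{
    p->exited = true;
}

/*
 * Teardown order the thread implies a correct unregister would need:
 * stop every governor, exit every policy, and only then drop the driver
 * pointer. Clearing the pointer first (what the existing code was said
 * to do) would leave governors running with no driver behind them.
 */
static void *model_unregister_driver(void *driver, struct model_policy *pols,
                                     size_t n)
{
    (void)driver; /* the real code would sanity-check this */
    for (size_t i = 0; i < n; i++)
        model_stop_governor(&pols[i]);
    for (size_t i = 0; i < n; i++)
        model_exit_policy(&pols[i]);
    return NULL; /* driver pointer cleared last */
}
```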
[PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend
The CPUfreq driver moves the cpufreq policy ownership between CPUs when
CPUs within a cluster (CPUs sharing the same policy) go ONLINE/OFFLINE.
When moving policy ownership between CPUs, it also moves the cpufreq sysfs
directory between CPUs and fixes up the symlinks of the other CPUs in the
cluster.

Also, when all the CPUs in a cluster go OFFLINE, all the sysfs nodes and
directories are deleted, the kobject is released and the policy is freed.
And when the first CPU in a cluster comes up, the policy is reallocated
and initialized, the kobject is acquired, the sysfs nodes are created or
symlinked, etc.

All these steps end up creating unnecessarily complicated code and
locking. There's no real benefit to adding/removing/moving the sysfs nodes
and the policy between CPUs. Other per-CPU sysfs directories like power
and cpuidle are left alone during hotplug, so there's some precedent for
what this patch is trying to do.

This patch simplifies a lot of the code and locking by removing the
adding/removing/moving of policy/sysfs/kobj and just leaves the cpufreq
directory and policy in place irrespective of whether the CPUs are
ONLINE/OFFLINE.

Leaving the policy, sysfs and kobject in place also brings these
additional benefits:
* Faster suspend/resume.
* Faster hotplug.
* Sysfs file permissions maintained across hotplug without userspace
  workarounds.
* Policy settings and governor tunables maintained across suspend/resume
  and hotplug.
* Cpufreq stats would be maintained across hotplug for all CPUs and can be
  queried even after a CPU goes OFFLINE.

Change-Id: I39c395e1fee8731880c0fd7c8a9c1d83e2e4b8d0
Tested-by: Stephen Boyd <sb...@codeaurora.org>
Signed-off-by: Saravana Kannan <skan...@codeaurora.org>
---
Preliminary testing has been done. cpufreq directories are getting created
properly. Online/offline of CPUs work. Policies remain unmodifiable from
userspace when all policy CPUs are offline.

Error handling code has NOT been updated.
I've added a bunch of FIXME comments next to where I'm not sure about the
locking in the existing code. I believe most of the try_lock's were
present to prevent a deadlock between the sysfs lock and the cpufreq
locks. Now that the sysfs entries are not touched after creating them, we
should be able to replace most/all of these try_lock's with a normal lock.

This patch has more room for code simplification, but I would like to get
some acks for the functionality and this code before I do further
simplification.

I should also be able to remove get_online_cpus() in the store function
and replace it with just a check for policy->governor_enabled. That should
theoretically reduce some contention between the cpufreq stats check and
hotplug of unrelated CPUs.

Appreciate all the feedback.

Thanks,
Saravana

 drivers/cpufreq/cpufreq.c | 331 ++
 1 file changed, 69 insertions(+), 262 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 62259d2..e350b15 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -859,16 +859,16 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr)
 }
 EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
 
-/* symlink affected CPUs */
+/* symlink related CPUs */
 static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
 {
-	unsigned int j;
+	unsigned int j, first_cpu = cpumask_first(policy->related_cpus);
 	int ret = 0;
 
-	for_each_cpu(j, policy->cpus) {
+	for_each_cpu(j, policy->related_cpus) {
 		struct device *cpu_dev;
 
-		if (j == policy->cpu)
+		if (j == first_cpu)
 			continue;
 
 		pr_debug("Adding link for CPU: %u\n", j);
@@ -881,12 +881,16 @@ static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
 	return ret;
 }
 
-static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
-				     struct device *dev)
+static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
 {
 	struct freq_attr **drv_attr;
+	struct device *dev;
 	int ret = 0;
 
+	dev = get_cpu_device(cpumask_first(policy->related_cpus));
+	if (!dev)
+		return -EINVAL;
+
 	/* prepare interface data */
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj,
 				   "cpufreq");
@@ -961,60 +965,53 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
-				  unsigned int cpu, struct device *dev)
+static int cpufreq_change_policy_cpus(struct cpufreq_policy *policy,
+				      unsigned int cpu, bool add)
 {
 	int ret = 0;
-	unsigned long flags;
+	unsigned int cpus;
 
-	if (has_target()) {
+	down_write(&policy->rwsem);
+	cpus = cpumask_weight(policy->cpus);
+	if (has_target() && cpus > 0) {
 		ret =
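The cpufreq_change_policy_cpus() hunk is truncated above, but its intended add/remove semantics can be read off the surrounding discussion: update policy->cpus for the hotplugged CPU, start the governor when the first CPU of the policy comes up, stop it when the last one goes down, and leave the policy and sysfs in place throughout. A userspace model of that state machine -- bitmask cpumask, hypothetical model_* names, a sketch rather than the patch's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the reworked policy-CPU bookkeeping. */
struct model_policy {
    uint32_t cpus;          /* online CPUs in this policy (bitmask) */
    bool governor_running;
};

/*
 * On add: start the governor when the first CPU of the policy comes up.
 * On remove: stop it when the last CPU goes down. The policy itself and
 * its sysfs directory are never torn down, which is the point of the
 * patch -- only the governor's running state tracks hotplug.
 */
static void model_change_policy_cpus(struct model_policy *p, int cpu, bool add)
{
    bool was_empty = (p->cpus == 0);

    if (add)
        p->cpus |= 1u << cpu;
    else
        p->cpus &= ~(1u << cpu);

    if (add && was_empty)
        p->governor_running = true;   /* first CPU online: start governor */
    else if (!add && p->cpus == 0)
        p->governor_running = false;  /* last CPU offline: stop governor */
}
```

Note the real function also has to serialize against concurrent hotplug (the down_write(&policy->rwsem) visible in the hunk), which this single-threaded model omits.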