Re: [PATCH-v3 2/3] mfd: 88pm800: Allow configuration of interrupt clear method

2015-06-24 Thread Vaibhav Hiremath



On Thursday 25 June 2015 11:20 AM, Krzysztof Kozlowski wrote:
> On 25.06.2015 14:44, Vaibhav Hiremath wrote:
>> On Thursday 25 June 2015 11:02 AM, Krzysztof Kozlowski wrote:
>>> On 25.06.2015 14:26, Vaibhav Hiremath wrote:
>>>> On Thursday 25 June 2015 05:33 AM, Krzysztof Kozlowski wrote:
>>>>> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath:
>>>>>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
>>>>>> (page 0) controls the method of clearing interrupt
>>>>>> status of 88pm800 family of devices;
>>>>>>
>>>>>>  0: clear on read
>>>>>>  1: clear on write
>>>>>>
>>>>>> This patch allows to configure this field, through DT.
>>>>>>
>>>>>> Also, as suggested by "Lee Jones" renaming DT property and variable
>>>>>> field to appropriate name.
>>>>>>
>>>>>> Signed-off-by: Zhao Ye
>>>>>> Signed-off-by: Vaibhav Hiremath
>>
>> Yes,
>> Fair enough...
>>
>> I see very little value in runtime configuration, why not just do it
>> only way (either read or write)?
>> I would prefer to just set it by default (during init), to clear irq on
>> write.
>
> Hard-coding a default value, if board files are not present, looks OK to me.



This is how it will look, I will also update the binding information
with this.


hvaibhav@hvaibhav-ThinkPad-T440p:~/projects/mainline/linux$ git diff --cached

diff --git a/drivers/mfd/88pm800.c b/drivers/mfd/88pm800.c
index 0a417ac..e415a06 100644
--- a/drivers/mfd/88pm800.c
+++ b/drivers/mfd/88pm800.c
@@ -645,9 +645,8 @@ static int pm800_probe(struct i2c_client *client,
 			dev_err(&client->dev, "failed to allocaate memory\n");
 			return -ENOMEM;
 		}
-
-		pdata->irq_clr_on_wr = of_property_read_bool(np,
-						"marvell,irq-clr-on-write");
+		/* Setting irq clear method on write */
+		pdata->irq_clr_on_wr = true;
 	}

 	ret = pm80x_init(client);



Thanks,
Vaibhav
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH-v3 2/3] mfd: 88pm800: Allow configuration of interrupt clear method

2015-06-24 Thread Krzysztof Kozlowski
On 25.06.2015 14:44, Vaibhav Hiremath wrote:
> 
> 
> On Thursday 25 June 2015 11:02 AM, Krzysztof Kozlowski wrote:
>> On 25.06.2015 14:26, Vaibhav Hiremath wrote:
>>>
>>>
>>> On Thursday 25 June 2015 05:33 AM, Krzysztof Kozlowski wrote:
>>>> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath:
>>>>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
>>>>> (page 0) controls the method of clearing interrupt
>>>>> status of 88pm800 family of devices;
>>>>>
>>>>> 0: clear on read
>>>>> 1: clear on write
>>>>>
>>>>> This patch allows to configure this field, through DT.
>>>>>
>>>>> Also, as suggested by "Lee Jones" renaming DT property and variable
>>>>> field to appropriate name.
>>>>>
>>>>> Signed-off-by: Zhao Ye
>>>>> Signed-off-by: Vaibhav Hiremath
>>>>
>>>> It does not look like a property of the board. Instead it looks like a
>>>> runtime configuration so it should not be part of DT bindings.

>>>
>>> Why do you say that?
>>>
>>> It is very well feature of 88PM860 device, where you can control irq
>>> clear operation (either read/write).
>>>
>>>
>>> Thanks,
>>> Vaibhav
>>>
>>>> I understand that previously this was configured by platform data and
>>>> now you want to move everything to DT. But this does not belong to
>>>> DT...

>>>
>>> That's not completely true.
>>> I think DT is the right place for this configuration.
>>
>> DT and its bindings describe the specific board or device. Let me quote:
>> <<The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
>> structure and language for describing hardware.  More specifically, it
>> is a description of hardware that is readable by an operating system...>>
>>
>> Whether you clear interrupts by writing or reading is configured during
>> runtime and it is completely independent to wiring. Each board with
>> 88pm800 would allow both methods. So this is not a property of hardware
>> in the terms of open firmware. This is a runtime configuration.
>>
> 
> Yes,
> Fair enough...
> 
> I see very little value in runtime configuration, why not just do it
> only way (either read or write)?
> I would prefer to just set it by default (during init), to clear irq on
> write.

Hard-coding a default value, if board files are not present, looks OK to me.

Best regards,
Krzysztof
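
The outcome agreed above (always selecting clear-on-write during init) comes down to setting a single bit. The following is a minimal sketch in plain C, assuming only what the patch description states: bit 1 of register 0xe, with 0 meaning clear on read and 1 meaning clear on write. The macro and function names are hypothetical and are not taken from the 88pm800 driver.

```c
#include <stdint.h>

/* Hypothetical name; only the bit position (bit 1) comes from the thread. */
#define INT_CLEAR_MODE_BIT (1u << 1)

/*
 * Return the register value with the interrupt clear mode selected.
 * clear_on_write != 0 sets bit 1 (clear on write); 0 clears it (clear on read).
 * All other bits are left untouched.
 */
uint8_t set_int_clear_mode(uint8_t reg, int clear_on_write)
{
    if (clear_on_write)
        return (uint8_t)(reg | INT_CLEAR_MODE_BIT);
    return (uint8_t)(reg & ~INT_CLEAR_MODE_BIT);
}
```

Hard-coding the default then corresponds to always calling the helper with a non-zero second argument, so no DT property is involved.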



Re: [GIT PULL] f2fs updates for v4.2

2015-06-24 Thread Jaegeuk Kim
On Thu, Jun 25, 2015 at 05:33:34AM +0100, Al Viro wrote:
> On Wed, Jun 24, 2015 at 08:42:02PM -0700, Linus Torvalds wrote:
> > On Wed, Jun 24, 2015 at 1:25 PM, Jaegeuk Kim  wrote:
> > >
> > > New features are:
> > >  o per-file encryption (e.g., ext4)
> > 
> > The new encrypted symlinks needed fixups for the changes that happened
> > meanwhile to the symlink handling. I did all that in my merge, and I
> > *think* I got it all right, but I would like you to check. In
> > particular, I hope you have a test-case and can actually give it a
> > whirl on that.
> > 
> > Al added to cc, just in case he could also check my merge resolution
> > of fs/f2fs/namei.c (the merge is commit cfcc0ad47f4c, I'll push it out
> > after I've finished the filesystem pulls)
> 
> FWIW, linux-next contains fixups for a bunch of such stuff,
> including f2fs one.  The only difference between your resolution and
> Stephen's fixup is
> static const char *f2fs_encrypted_follow_link(struct dentry *dentry,
> void **cookie)
> vs.
> static const char *f2fs_encrypted_follow_link(struct dentry *dentry, void **cookie)
> 
> Said that, f2fs_symlink() looks odd - we create a directory entry *before*
> doing page_symlink().  And if it (or encryption) fails, I don't see anything
> that would remove that new directory entry.  What are we ending up with
> in such case?

Thanks Al,

Right, I missed merging the fix-up patch in linux-next into my pull-request.
At a glance, I think there is no problem; except 80 column width, though.

Also, agreed that I need to take a look at deleting the dentry to deal with that
failure case.

Thanks,


Re: [PATCH v2] ipc: Modify message queue accounting to reflect both total user data and auxiliary kernel data

2015-06-24 Thread Davidlohr Bueso
On Tue, 2015-06-23 at 00:25 +0200, Marcus Gelderie wrote:
> A while back, the message queue implementation in the kernel was
> improved to use btrees to speed up retrieval of messages (commit
> d6629859b36). The patch introducing the improved kernel handling of
> message queues (using btrees) has, as a by-product, changed the
> meaning of the QSIZE field in the pseudo-file created for the queue.
> Before, this field reflected the size of the user-data in the queue.
> Since, it also takes kernel data structures into account. For
> example, if 13 bytes of user data are in the queue, on my machine the
> file reports a size of 61 bytes.

Good catch, and a nice opportunity to make the mq manpage more specific
wrt queue sizes.

[...]

> Reporting the size of the message queue in kernel has its merits, but
> doing so in the QSIZE field of the pseudo file corresponding to the
> queue is a breaking change, as mentioned above. This patch therefore
> returns the QSIZE  field to its original meaning. At the same time,
> it introduces a new field QKERSIZE that reflects the size of the queue
> in kernel (user data + kernel data).

Hmmm I'm not sure about this. What are the specific benefits of having
QKERSIZE? We don't export in-kernel data like this in any other ipc
(posix or sysv) mechanism, afaik. Plus, we do not compromise kernel data
structures like this, as we would break userspace if later we change
posix_msg_tree_node. So NAK to this.

I would just remove the extra
+   info->qsize += sizeof(struct posix_msg_tree_node);

bits from d6629859b36 (along with -stable v3.5), plus a patch updating
the manpage that this field only reflects user data.

Thanks,
Davidlohr
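
To make the two meanings of QSIZE concrete, here is a small sketch in plain C. It assumes a per-node overhead of 48 bytes, which is simply the difference between the figures quoted above (61 reported bytes for 13 bytes of user data); the real value is sizeof(struct posix_msg_tree_node) and depends on the machine. The function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: 61 - 13 = 48 bytes of kernel data per tree node, taken from
 * the numbers quoted in the thread. The real value is machine dependent. */
#define ASSUMED_NODE_OVERHEAD ((size_t)48)

/* QSIZE as reported after d6629859b36: user data plus tree node overhead. */
size_t qsize_with_kernel_data(size_t user_bytes, size_t nodes)
{
    return user_bytes + nodes * ASSUMED_NODE_OVERHEAD;
}

/* QSIZE under its original meaning: user data only. */
size_t qsize_user_only(size_t user_bytes, size_t nodes)
{
    (void)nodes; /* kernel bookkeeping is deliberately not counted */
    return user_bytes;
}
```

With one 13-byte message in the queue, the first function gives the 61 bytes from the report and the second gives 13, which is the value the field would show again once the extra accounting is removed.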



Re: Use of pinctrl-single for external device over I2C

2015-06-24 Thread Vaibhav Hiremath



On Thursday 25 June 2015 10:08 AM, Tony Lindgren wrote:
> * Vaibhav Hiremath [150624 10:12]:
>>
>> I do not like this, as this is not HW feature, so DT may not be right
>> approach.
>>
>> So I will dig more from either runtime or Compile time option to use
>> regmap_ Vs raw read/writes.
>
> Can't you just check if the pinctrl node has compatible = "syscon"
> property?
>
> A compile time option won't work for sure. I don't know what you
> would check at runtime as you do not know what the bus is behind
> syscon.

I haven't gone through syscon yet, so I am not sure whether it
would be useful.

As you rightly stated, we need to know the bus behind regmap.

Thanks,
Vaibhav


Re: [PATCH-v3 2/3] mfd: 88pm800: Allow configuration of interrupt clear method

2015-06-24 Thread Vaibhav Hiremath



On Thursday 25 June 2015 11:02 AM, Krzysztof Kozlowski wrote:
> On 25.06.2015 14:26, Vaibhav Hiremath wrote:
>> On Thursday 25 June 2015 05:33 AM, Krzysztof Kozlowski wrote:
>>> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath:
>>>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
>>>> (page 0) controls the method of clearing interrupt
>>>> status of 88pm800 family of devices;
>>>>
>>>> 0: clear on read
>>>> 1: clear on write
>>>>
>>>> This patch allows to configure this field, through DT.
>>>>
>>>> Also, as suggested by "Lee Jones" renaming DT property and variable
>>>> field to appropriate name.
>>>>
>>>> Signed-off-by: Zhao Ye
>>>> Signed-off-by: Vaibhav Hiremath
>>>
>>> It does not look like a property of the board. Instead it looks like a
>>> runtime configuration so it should not be part of DT bindings.
>>
>> Why do you say that?
>>
>> It is very well feature of 88PM860 device, where you can control irq
>> clear operation (either read/write).
>>
>> Thanks,
>> Vaibhav
>>
>>> I understand that previously this was configured by platform data and
>>> now you want to move everything to DT. But this does not belong to
>>> DT...
>>
>> That's not completely true.
>> I think DT is the right place for this configuration.
>
> DT and its bindings describe the specific board or device. Let me quote:
> <<The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
> structure and language for describing hardware.  More specifically, it
> is a description of hardware that is readable by an operating system...>>
>
> Whether you clear interrupts by writing or reading is configured during
> runtime and it is completely independent to wiring. Each board with
> 88pm800 would allow both methods. So this is not a property of hardware
> in the terms of open firmware. This is a runtime configuration.

Yes,
Fair enough...

I see very little value in runtime configuration, why not just do it
only way (either read or write)?
I would prefer to just set it by default (during init), to clear irq on
write.


Thanks,
Vaibhav


Re: [PATCH-v3 1/3] mfd: 88pm800: Add device tree support

2015-06-24 Thread Krzysztof Kozlowski
On 25.06.2015 14:27, Vaibhav Hiremath wrote:
> 
> 
> On Thursday 25 June 2015 05:27 AM, Krzysztof Kozlowski wrote:
>> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath
>> :
>>> Add DT support to the 88pm800 driver, along with compatible
>>> field for its sub-devices (rtc, onkey and regulator)
>>>
>>> Signed-off-by: Chao Xie 
>>> Signed-off-by: Vaibhav Hiremath 
>>> ---
>>>   drivers/mfd/88pm800.c | 25 +
>>>   1 file changed, 25 insertions(+)
>>>
>>> diff --git a/drivers/mfd/88pm800.c b/drivers/mfd/88pm800.c
>>> index 841717a..059f01a 100644
>>> --- a/drivers/mfd/88pm800.c
>>> +++ b/drivers/mfd/88pm800.c
>>> @@ -27,6 +27,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>
>>>   /* Interrupt Registers */
>>>   #define PM800_INT_STATUS1  (0x05)
>>> @@ -121,6 +122,11 @@ static const struct i2c_device_id
>>> pm80x_id_table[] = {
>>>   };
>>>   MODULE_DEVICE_TABLE(i2c, pm80x_id_table);
>>>
>>> +static const struct of_device_id pm80x_of_match_table[] = {
>>> +   { .compatible = "marvell,88pm800", },
>>> +   {},
>>> +};
>>> +
>>>   static struct resource rtc_resources[] = {
>>>  {
>>>   .name = "88pm80x-rtc",
>>> @@ -133,6 +139,7 @@ static struct resource rtc_resources[] = {
>>>   static struct mfd_cell rtc_devs[] = {
>>>  {
>>>   .name = "88pm80x-rtc",
>>> +.of_compatible = "marvell,88pm80x-rtc",
>>>   .num_resources = ARRAY_SIZE(rtc_resources),
>>>   .resources = &rtc_resources[0],
>>>   .id = -1,
>>> @@ -151,6 +158,7 @@ static struct resource onkey_resources[] = {
>>>   static const struct mfd_cell onkey_devs[] = {
>>>  {
>>>   .name = "88pm80x-onkey",
>>> +.of_compatible = "marvell,88pm80x-onkey",
>>>   .num_resources = 1,
>>>   .resources = &onkey_resources[0],
>>>   .id = -1,
>>> @@ -160,6 +168,7 @@ static const struct mfd_cell onkey_devs[] = {
>>>   static const struct mfd_cell regulator_devs[] = {
>>>  {
>>>   .name = "88pm80x-regulator",
>>> +.of_compatible = "marvell,88pm80x-regulator",
>>>   .id = -1,
>>>  },
>>>   };
>>> @@ -544,8 +553,23 @@ static int pm800_probe(struct i2c_client *client,
>>>  int ret = 0;
>>>  struct pm80x_chip *chip;
>>>  struct pm80x_platform_data *pdata =
>>> dev_get_platdata(&client->dev);
>>> +   struct device_node *np = client->dev.of_node;
>>>  struct pm80x_subchip *subchip;
>>>
>>> +   if (!pdata && !np) {
>>> +   dev_err(&client->dev,
>>> +   "pm80x requires platform data or of_node\n");
>>> +   return -EINVAL;
>>> +   }
>>> +
>>> +   if (!pdata) {
>>> +   pdata = devm_kzalloc(&client->dev, sizeof(*pdata),
>>> GFP_KERNEL);
>>> +   if (!pdata) {
>>> +   dev_err(&client->dev, "failed to allocaate
>>> memory\n");
>>
>> Generic error message for ENOMEM is not needed. Just return ENOMEM and
>> the core code will print the error.
>>
>> Rest looks fine,
> 
> 
> Ok, will remove it.
> 
> Should I add your reviewed-by in V4 for this patch?

No, not yet. :)
I would put such tag in my reply if I had that intention.

Best regards,
Krzysztof


Re: [PATCH-v3 2/3] mfd: 88pm800: Allow configuration of interrupt clear method

2015-06-24 Thread Krzysztof Kozlowski
On 25.06.2015 14:26, Vaibhav Hiremath wrote:
> 
> 
> On Thursday 25 June 2015 05:33 AM, Krzysztof Kozlowski wrote:
>> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath
>> :
>>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
>>> (page 0) controls the method of clearing interrupt
>>> status of 88pm800 family of devices;
>>>
>>>0: clear on read
>>>1: clear on write
>>>
>>> This patch allows to configure this field, through DT.
>>>
>>> Also, as suggested by "Lee Jones" renaming DT property and variable
>>> field to appropriate name.
>>>
>>> Signed-off-by: Zhao Ye 
>>> Signed-off-by: Vaibhav Hiremath 
>>
>> It does not look like a property of the board. Instead it looks like a
>> runtime configuration so it should not be part of DT bindings.
>>
> 
> Why do you say that?
> 
> It is very well feature of 88PM860 device, where you can control irq
> clear operation (either read/write).
> 
> 
> Thanks,
> Vaibhav
> 
>> I understand that previously this was configured by platform data and
>> now you want to move everything to DT. But this does not belong to
>> DT...
>>
> 
> That's not completely true.
> I think DT is the right place for this configuration.

DT and its bindings describe the specific board or device. Let me quote:
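<<The "Open Firmware Device Tree", or simply Device Tree (DT), is a data
structure and language for describing hardware.  More specifically, it
is a description of hardware that is readable by an operating system...>>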

Whether you clear interrupts by writing or reading is configured during
runtime and it is completely independent to wiring. Each board with
88pm800 would allow both methods. So this is not a property of hardware
in the terms of open firmware. This is a runtime configuration.

Description of hardware would be a property which specifies whether a
88pm800-like device or a board using 88pm800 device ALLOWS choosing
different interrupt clearing.

Best regards,
Krzysztof




Re: lzo: check for length overrun in variable length encoding backport

2015-06-24 Thread Willy Tarreau
Hi Florian,

On Wed, Jun 24, 2015 at 09:48:46PM -0700, Florian Fainelli wrote:
> Hi,
> 
> Could you backport this commit:
> 72cf90124e87d975d0b2114d930808c58b4c05e4 ("lzo: check for length overrun
> in variable length encoding.") into stable kernels older than 3.18?
> 
> It should apply cleanly to anything that contains
> 8b975bd3f9089f8ee5d7bbfd798537b992bbc7e7 ("lib/lzo: Update LZO
> compression to current upstream version") which goes as far as 3.9.

Well, it was merged into 3.10.59 as commit 9689415259, 3.12.32 as 4277fc42,
3.14.23 as 7f5f71a92.

Are you sure you didn't mean something else and confused it with another id ?

Thanks,
Willy



Re: [PATCH-v3 1/3] mfd: 88pm800: Add device tree support

2015-06-24 Thread Vaibhav Hiremath



On Thursday 25 June 2015 05:27 AM, Krzysztof Kozlowski wrote:
> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath:
>> Add DT support to the 88pm800 driver, along with compatible
>> field for its sub-devices (rtc, onkey and regulator)
>>
>> Signed-off-by: Chao Xie
>> Signed-off-by: Vaibhav Hiremath
>> ---
>>   drivers/mfd/88pm800.c | 25 +
>>   1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/mfd/88pm800.c b/drivers/mfd/88pm800.c
>> index 841717a..059f01a 100644
>> --- a/drivers/mfd/88pm800.c
>> +++ b/drivers/mfd/88pm800.c
>> @@ -27,6 +27,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>
>>   /* Interrupt Registers */
>>   #define PM800_INT_STATUS1  (0x05)
>> @@ -121,6 +122,11 @@ static const struct i2c_device_id pm80x_id_table[] = {
>>   };
>>   MODULE_DEVICE_TABLE(i2c, pm80x_id_table);
>>
>> +static const struct of_device_id pm80x_of_match_table[] = {
>> +   { .compatible = "marvell,88pm800", },
>> +   {},
>> +};
>> +
>>   static struct resource rtc_resources[] = {
>>  {
>>   .name = "88pm80x-rtc",
>> @@ -133,6 +139,7 @@ static struct resource rtc_resources[] = {
>>   static struct mfd_cell rtc_devs[] = {
>>  {
>>   .name = "88pm80x-rtc",
>> +.of_compatible = "marvell,88pm80x-rtc",
>>   .num_resources = ARRAY_SIZE(rtc_resources),
>>   .resources = &rtc_resources[0],
>>   .id = -1,
>> @@ -151,6 +158,7 @@ static struct resource onkey_resources[] = {
>>   static const struct mfd_cell onkey_devs[] = {
>>  {
>>   .name = "88pm80x-onkey",
>> +.of_compatible = "marvell,88pm80x-onkey",
>>   .num_resources = 1,
>>   .resources = &onkey_resources[0],
>>   .id = -1,
>> @@ -160,6 +168,7 @@ static const struct mfd_cell onkey_devs[] = {
>>   static const struct mfd_cell regulator_devs[] = {
>>  {
>>   .name = "88pm80x-regulator",
>> +.of_compatible = "marvell,88pm80x-regulator",
>>   .id = -1,
>>  },
>>   };
>> @@ -544,8 +553,23 @@ static int pm800_probe(struct i2c_client *client,
>>  int ret = 0;
>>  struct pm80x_chip *chip;
>>  struct pm80x_platform_data *pdata = dev_get_platdata(&client->dev);
>> +   struct device_node *np = client->dev.of_node;
>>  struct pm80x_subchip *subchip;
>>
>> +   if (!pdata && !np) {
>> +   dev_err(&client->dev,
>> +   "pm80x requires platform data or of_node\n");
>> +   return -EINVAL;
>> +   }
>> +
>> +   if (!pdata) {
>> +   pdata = devm_kzalloc(&client->dev, sizeof(*pdata), GFP_KERNEL);
>> +   if (!pdata) {
>> +   dev_err(&client->dev, "failed to allocaate memory\n");
>
> Generic error message for ENOMEM is not needed. Just return ENOMEM and
> the core code will print the error.
>
> Rest looks fine,

Ok, will remove it.

Should I add your reviewed-by in V4 for this patch?

Thanks,
Vaibhav


Re: [PATCH-v3 2/3] mfd: 88pm800: Allow configuration of interrupt clear method

2015-06-24 Thread Vaibhav Hiremath



On Thursday 25 June 2015 05:33 AM, Krzysztof Kozlowski wrote:
> 2015-06-24 18:21 GMT+09:00 Vaibhav Hiremath:
>> As per the spec, bit 1 (INT_CLEAR_MODE) of reg addr 0xe
>> (page 0) controls the method of clearing interrupt
>> status of 88pm800 family of devices;
>>
>>    0: clear on read
>>    1: clear on write
>>
>> This patch allows to configure this field, through DT.
>>
>> Also, as suggested by "Lee Jones" renaming DT property and variable
>> field to appropriate name.
>>
>> Signed-off-by: Zhao Ye
>> Signed-off-by: Vaibhav Hiremath
>
> It does not look like a property of the board. Instead it looks like a
> runtime configuration so it should not be part of DT bindings.

Why do you say that?

It is very well feature of 88PM860 device, where you can control irq
clear operation (either read/write).

Thanks,
Vaibhav

> I understand that previously this was configured by platform data and
> now you want to move everything to DT. But this does not belong to
> DT...

That's not completely true.
I think DT is the right place for this configuration.

Thanks,
Vaibhav


Re: [PATCH V2 1/1] usb:serial:f81534 Add F81532/534 Driver

2015-06-24 Thread Peter Hung
Hello Johan,

Peter Hung wrote on 2015/6/15 at 09:54 AM:
> This driver is for Fintek F81532/F81534 USB to Serial Ports IC.
> 
> Features:
> 1. F81534 is 1-to-4 & F81532 is 1-to-2 serial ports IC
> 2. Support Baudrate from B50 to B150 (excluding B100).
> 3. The RTS signal can be transformed their behavior with configuration
> for transceiver (for RS232/RS485/RS422) (/sys/class/ttyUSBx/uart_mode)
> 4. There are 4x3 output-only GPIOs to control transceiver mode. They
> can be controlled via sysfs (/sys/class/ttyUSBx/gpio)
> 

Did you receive my patch?
Is there anything I should do to improve it?

-- 
With Best Regards,
Peter Hung


Re: [PATCH 2/2] arm: boot: store ATAG structure into DT atags field

2015-06-24 Thread Tony Lindgren
* Arnd Bergmann  [150515 13:23]:
> On Friday 15 May 2015 22:16:24 Pali Rohár wrote:
> > On Friday 15 May 2015 22:12:41 Arnd Bergmann wrote:
> > > On Friday 15 May 2015 21:50:07 Pali Rohár wrote:
> > > > }
> > > > 
> > > > }
> > > > 
> > > > +   /* include the terminating ATAG_NONE */
> > > > +   atag_size = (char *)atag - (char *)atag_list +
> > > > sizeof(struct tag_header); +   setprop(fdt, "/", "atags",
> > > > atag_list, atag_size);
> > > > +
> > > > 
> > > > if (memcount) {
> > > > 
> > > > setprop(fdt, "/memory", "reg", mem_reg_property,
> > > > 
> > > > 4 * memcount * memsize);
> > > 
> > > The property should probably have a DT binding, and be named
> > > "linux,atags".
> > > 
> > > It may also help to check if the "linux,atags" property already
> > > exists and not create it otherwise. That way we can put it into the
> > > n900 dts file and have it updated by the compat code, but not expose
> > > the atags on other platforms unless they opt in.

Using "linux,atags" sounds good to me. And yes checking it with
getprop before doing setprop makes sense.

> > Maybe what would help: Is there a way to tell decompressor/kernel to not 
> > touch atag memory and then after kernel/board-code starts it save copy 
> > of atags? I think it is not possible right now, but correct me if I'm 
> > wrong...
> > 
> 
> I don't think that is possible without an incompatible change to the
> boot protocol.

Agreed, let's keep the changes to minimum.

Looks like with the comments posted all the pending four patches
from Pali become quite a minimal set of three patches if we keep
the rev string as hex.

Regards,

Tony
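
The atag_size computation in the quoted hunk (distance from the start of the list to the terminator, plus one header for the terminating ATAG_NONE) can be sketched in standalone C as follows. This assumes the usual ATAG layout, in which every tag starts with its size in 32-bit words and a size of 0 marks ATAG_NONE. The struct is a simplified stand-in for the kernel's struct tag_header and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct tag_header. */
struct tag_header {
    uint32_t size; /* length of the tag in 32-bit words, header included */
    uint32_t tag;  /* tag identifier; ATAG_NONE terminates the list */
};

/* Bytes occupied by an ATAG list, including the terminating ATAG_NONE. */
size_t atag_list_size(const uint32_t *atag_list)
{
    const uint32_t *atag = atag_list;

    /* The first word of each tag is its size; a size of 0 ends the list. */
    while (atag[0] != 0)
        atag += atag[0];

    /* Same arithmetic as the patch: pointer distance plus one header. */
    return (size_t)((const char *)atag - (const char *)atag_list)
           + sizeof(struct tag_header);
}
```

For a list holding a two-word tag followed by a four-word tag, the walk stops after 24 bytes and the function returns 32 once the terminator's header is counted.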


Re: [PATCH 1/2] arm: devtree: Save atags if are in DT atags field

2015-06-24 Thread Tony Lindgren
* Arnd Bergmann  [150515 13:11]:
> On Friday 15 May 2015 21:50:06 Pali Rohár wrote:
> > @@ -256,5 +257,10 @@ const struct machine_desc * __init 
> > setup_machine_fdt(unsigned int dt_phys)
> > system_rev = 0;
> > }
> >  
> > +   /* Save atags */
> > +   prop = of_get_flat_dt_prop(dt_root, "atags", NULL);
> > +   if (prop)
> > +   save_atags((void *)prop);
> > +
> > return mdesc;
> > 
> 
> How about checking whether this is actually running on the one board
> that needs it first?
> 
> I'd rather not introduce something that may end up being considered
> an ABI on other machines.

It seems having this within CONFIG_ARM_ATAG_DTB_COMPAT should be
enough here.

Regards,

Tony


Re: [RESEND] [PATCH v2 1/2] arm: devtree: Set system_rev from DT revision

2015-06-24 Thread Tony Lindgren
* Pali Rohár  [150506 04:45]:
> On Wednesday 06 May 2015 13:04:01 Arnd Bergmann wrote:
> > > 
> > > It needs to be done in this code, so "system_rev" variable is set
> > > properly...
> > 
> > What I mean is which code accesses this variable that early?
> > 
> 
> ATAG code is doing it at same early stage, so I added it to same early
> stage...

Yes we should do this early like the other atags.
 
> > > > Also, it seems strange to have a string property and then use kstrtouint
> > > > to convert it into a number. I think it should either be specified in a 
> > > > DT
> > > > binding to be a string and then have the kernel not assume that it is a 
> > > > number,
> > > > or we should define it to be binary.
> > > > 
> > > >   Arnd
> > > 
> > > Variable "system_rev" is number and it always was. So changing type will
> > > break more parts.
> > > 
> > > And it is string DT property to be human readable. Some other developers
> > > suggested for v2 to change it to string (from number).
> > 
> > Both of them would be human readable, you just use something else to
> > read them ;-)
> > 
> > If we have a string here, we should just change all uses of system_rev
> > in the kernel accordingly, there are only a few of them:

Let's just keep it as a hex as it was. After all it's an existing
interface in /proc that user space programs may expect to be in
hex format already.

Pali, care to repost the whole set again right after -rc1 with
rev property naming and documentation added? Just keep it
as hex and let's forget any string conversion.

Regards,

Tony


Re: [GIT PULL] ext4 changes for 4.2-rc1

2015-06-24 Thread Theodore Ts'o
Hi Linus,

Here's my suggested merge resolution to deal with Al Viro's symlink changes.

 - Ted

diff --cc fs/ext4/symlink.c
index ba5bd18,68e915a..000
--- a/fs/ext4/symlink.c
+++ b/fs/ext4/symlink.c
@@@ -35,19 -34,20 +34,17 @@@ static const char *ext4_follow_link(str
int res;
u32 plen, max_size = inode->i_sb->s_blocksize;
  
-   ctx = ext4_get_fname_crypto_ctx(inode, inode->i_sb->s_blocksize);
-   if (IS_ERR(ctx))
-   return ERR_CAST(ctx);
 -  if (!ext4_encrypted_inode(inode))
 -  return page_follow_link_light(dentry, nd);
 -
+   res = ext4_get_encryption_info(inode);
+   if (res)
+   return ERR_PTR(res);
  
if (ext4_inode_is_fast_symlink(inode)) {
caddr = (char *) EXT4_I(inode)->i_data;
max_size = sizeof(EXT4_I(inode)->i_data);
} else {
cpage = read_mapping_page(inode->i_mapping, 0, NULL);
-   if (IS_ERR(cpage)) {
-   ext4_put_fname_crypto_ctx(&ctx);
+   if (IS_ERR(cpage))
 -  return cpage;
 +  return ERR_CAST(cpage);
-   }
caddr = kmap(cpage);
caddr[size] = 0;
}
@@@ -77,14 -78,13 +75,12 @@@
/* Null-terminate the name */
if (res <= plen)
paddr[res] = '\0';
-   ext4_put_fname_crypto_ctx(&ctx);
 -  nd_set_link(nd, paddr);
if (cpage) {
kunmap(cpage);
page_cache_release(cpage);
}
 -  return NULL;
 +  return *cookie = paddr;
  errout:
-   ext4_put_fname_crypto_ctx(&ctx);
if (cpage) {
kunmap(cpage);
page_cache_release(cpage);


[RESUBMIT PATCH 1/1] arm/hw_breakpoint.c: remove unnecessary header

2015-06-24 Thread Maninder Singh
Header  is not needed for arm/hw_breakpoint.c,
Removing the same.

Signed-off-by: Maninder Singh 
Reviewed-by: Vaneet Narang 
---
 arch/arm/kernel/hw_breakpoint.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index dc7d0a9..6284779 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -35,7 +35,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 /* Breakpoint currently in use for each BRP. */
-- 
1.7.1



lzo: check for length overrun in variable length encoding backport

2015-06-24 Thread Florian Fainelli
Hi,

Could you backport this commit:
72cf90124e87d975d0b2114d930808c58b4c05e4 ("lzo: check for length overrun
in variable length encoding.") into stable kernels older than 3.18?

It should apply cleanly to anything that contains
8b975bd3f9089f8ee5d7bbfd798537b992bbc7e7 ("lib/lzo: Update LZO
compression to current upstream version") which goes as far as 3.9.

Thanks!
-- 
Florian


Re: [PATCH 2/2] staging : Comedi : comedi_fops : Fixed the return error code

2015-06-24 Thread Sudip Mukherjee
On Wed, Jun 24, 2015 at 11:22:24PM +0530, Santosh wrote:
>   try_module_get fails when the reference count of the module is not
>   allowed to be incremented ,and hence -ENXIO is returned indicating
>   no device or address.
1) this patch is 2/2, but then where is your 1/2 patch?
2) You have used "santosh" in your email From: header.
   use "santosh pai" there. It should be same as what you use in your
   Signed-off-by:

regards
sudip


Re: Use of pinctrl-single for external device over I2C

2015-06-24 Thread Tony Lindgren
* Vaibhav Hiremath  [150624 10:12]:
> 
> I do not like this, as this is not HW feature, so DT may not be right
> approach.
> 
> So I will dig more from either runtime or Compile time option to use
> regmap_ Vs raw read/writes.

Can't you just check if the pinctrl node has compatible = "syscon"
property?

A compile time option won't work for sure. I don't know what you
would check at runtime as you do not know what the bus is behind
syscon.

Regards,

Tony


Re: [PATCH v2] staging: rtl8192u: bool tests don't need comparisons

2015-06-24 Thread Sudip Mukherjee
On Wed, Jun 24, 2015 at 12:12:01PM +0200, Luis de Bethencourt wrote:
> On Wed, Jun 24, 2015 at 11:05:16AM +0530, Sudip Mukherjee wrote:
> > On Tue, Jun 23, 2015 at 03:10:56PM +0200, Luis de Bethencourt wrote:
> 
> I based the patch on staging's master and not on the staging-next branch.
use staging-testing branch.

regards
sudip


Re: [GIT PULL] tracing: Have filter check for balanced ops

2015-06-24 Thread Steven Rostedt
On Thu, 25 Jun 2015 00:03:02 -0400
Sasha Levin  wrote:

> On 06/17/2015 08:36 AM, Steven Rostedt wrote:
> > Linus,
> > 
> > Vince Weaver reported a warning when he added perf event filters
> > into his fuzzer tests. There's a missing check of balanced
> > operations when parenthesis are used, and this triggers a WARN_ON()
> > and when reading the failure, the filter reports no failure occurred.
> 
> Hey Steven,
> 
> My fuzzings are hitting the warning added by this patch:

Yes, Vince said he was able to hit it as well. But the warning itself
is useless if you don't supply what filter was used to trigger it.

-- Steve


> 
> [2175114.187536] WARNING: CPU: 16 PID: 10388 at 
> kernel/trace/trace_events_filter.c:1388 replace_preds+0x814/0x2140()
> [2175114.190213] Modules linked in:
> [2175114.19] CPU: 16 PID: 10388 Comm: trinity-c48 Not tainted 
> 4.1.0-next-20150623-sasha-00039-ga1eb83a-dirty #2280
> [2175114.194463]  880a2335 6a8e22d4 880a2335f878 
> abc8cfa3
> [2175114.196547]    880a2335f8c8 
> a21ebd36
> [2175114.198604]  880e60fe09e0 a24608f4 880e61b14830 
> 880e60fe09d8
> [2175114.200666] Call Trace:
> [2175114.201377]  [] dump_stack+0x4f/0x7b
> [2175114.202793]  [] warn_slowpath_common+0xc6/0x120
> [2175114.206235]  [] warn_slowpath_null+0x1a/0x20
> [2175114.207819]  [] replace_preds+0x814/0x2140
> [2175114.216433]  [] create_filter+0x15a/0x210
> [2175114.231529]  [] apply_event_filter+0x28b/0x780
> [2175114.241196]  [] event_filter_write+0x106/0x1c0
> [2175114.242823]  [] do_loop_readv_writev+0x128/0x1e0
> [2175114.248901]  [] do_readv_writev+0x5ae/0x6c0
> [2175114.256760]  [] vfs_writev+0x72/0xb0
> [2175114.258134]  [] SyS_pwritev+0x1b4/0x220
> [2175114.261291]  [] tracesys_phase2+0x88/0x8d
> 
> 
> Thanks,
> Sasha



Re: [GIT PULL] f2fs updates for v4.2

2015-06-24 Thread Al Viro
On Wed, Jun 24, 2015 at 08:42:02PM -0700, Linus Torvalds wrote:
> On Wed, Jun 24, 2015 at 1:25 PM, Jaegeuk Kim  wrote:
> >
> > New features are:
> >  o per-file encryption (e.g., ext4)
> 
> The new encrypted symlinks needed fixups for the changes that happened
> meanwhile to the symlink handling. I did all that in my merge, and I
> *think* I got it all right, but I would like you to check. In
> particular, I hope you have a test-case and can actually give it a
> whirl on that.
> 
> Al added to cc, just in case he could also check my merge resolution
> of fs/f2fs/namei.c (the merge is commit cfcc0ad47f4c, I'll push it out
> after I've finished the filesystem pulls)

FWIW, linux-next contains fixups for a bunch of such stuff,
including f2fs one.  The only difference between your resolution and
Stephen's fixup is
static const char *f2fs_encrypted_follow_link(struct dentry *dentry,
  void **cookie)
vs.
static const char *f2fs_encrypted_follow_link(struct dentry *dentry, void **cookie)

Said that, f2fs_symlink() looks odd - we create a directory entry *before*
doing page_symlink().  And if it (or encryption) fails, I don't see anything
that would remove that new directory entry.  What are we ending up with
in such case?


Re: [PATCH][v2] asus-rbtn: new driver for asus radio button for Windows 8

2015-06-24 Thread Darren Hart
On Wed, Jun 24, 2015 at 10:57:51AM +0800, Alex Hung wrote:
> ASUS introduced a new approach to handle wireless hotkey
> since Windows 8.  When the hotkey is pressed, BIOS generates
> a notification 0x88 to a new ACPI device, ATK4001.  This
> new driver not only translates the notification to KEY_RFKILL
> but also toggles its LED accordingly.
> 
> Signed-off-by: Alex Hung 
> ---
>  MAINTAINERS  |   6 +
>  drivers/platform/x86/Kconfig |  11 ++
>  drivers/platform/x86/Makefile|   1 +
>  drivers/platform/x86/asus-rbtn.c | 240 
> +++
>  4 files changed, 258 insertions(+)
>  create mode 100644 drivers/platform/x86/asus-rbtn.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d8afd29..03711ce 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1673,6 +1673,12 @@ S: Maintained
>  F:   drivers/platform/x86/asus*.c
>  F:   drivers/platform/x86/eeepc*.c
>  
> +ASUS RADIO BUTTON DRIVER
> +M:   Alex Hung 
> +L:   platform-driver-...@vger.kernel.org
> +S:   Maintained
> +F:   drivers/platform/x86/asus-rbtn.c
> +
>  ASYNCHRONOUS TRANSFERS/TRANSFORMS (IOAT) API
>  R:   Dan Williams 
>  W:   http://sourceforge.net/projects/xscaleiop
> diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
> index f9f205c..a8ac885 100644
> --- a/drivers/platform/x86/Kconfig
> +++ b/drivers/platform/x86/Kconfig
> @@ -516,6 +516,17 @@ config EEEPC_LAPTOP
> If you have an Eee PC laptop, say Y or M here. If this driver
> doesn't work on your Eee PC, try eeepc-wmi instead.
>  
> +config ASUS_RBTN
> + tristate "ASUS radio button"
> + depends on ACPI
> + depends on INPUT
> + help
> +  This driver provides supports for new ASUS radio button for Windows 8.

s/supports/support/

Also, avoid using "new" in the Kconfig as this lives forever, in 10 years, it
won't be so new :-)

Consider:

"This driver supports the ASUS radio button for Windows 8."

(And maybe fix the entry for HP_WIRELESS while you're at it in a separate patch)

...

> +static int asus_rbtn_input_setup(void)
> +{
> + int err;
> +
> + asusrb_input_dev = input_allocate_device();
> + if (!asusrb_input_dev)
> + return -ENOMEM;
> +
> + asusrb_input_dev->name = "ASUS radio hotkeys";
> + asusrb_input_dev->phys = "atk4001/input0";
> + asusrb_input_dev->id.bustype = BUS_HOST;
> + asusrb_input_dev->evbit[0] = BIT(EV_KEY);
> + set_bit(KEY_RFKILL, asusrb_input_dev->keybit);
> +
> + err = input_register_device(asusrb_input_dev);
> + if (err)
> + goto err_free_dev;
> +
> + return 0;
> +
> +err_free_dev:
> + input_free_device(asusrb_input_dev);
> + return err;

I missed this on the first round. There is no need for a goto here at all:

int ret;
...
ret = input_register_device(asusrb_input_dev);
if (ret)
input_free_device(asusrb_input_dev);
return ret;

Much nicer IMHO.

Do you have a strong preference for err over ret? In most cases in this driver,
ret would be the more typical choice in my experience.

I suppose this is modeled after hp-wireless which has the same error path in
hp_wireless_input_setup I mentioned above and uses err throughout - consistency
is a good thing. I won't argue over the ret/err thing as there is precedent in
this subsystem for similar drivers.

-- 
Darren Hart
Intel Open Source Technology Center


Re: [GIT PULL] tracing: Have filter check for balanced ops

2015-06-24 Thread Sasha Levin
On 06/17/2015 08:36 AM, Steven Rostedt wrote:
> Linus,
> 
> Vince Weaver reported a warning when he added perf event filters
> into his fuzzer tests. There's a missing check of balanced
> operations when parenthesis are used, and this triggers a WARN_ON()
> and when reading the failure, the filter reports no failure occurred.

Hey Steven,

My fuzzings are hitting the warning added by this patch:

[2175114.187536] WARNING: CPU: 16 PID: 10388 at 
kernel/trace/trace_events_filter.c:1388 replace_preds+0x814/0x2140()
[2175114.190213] Modules linked in:
[2175114.19] CPU: 16 PID: 10388 Comm: trinity-c48 Not tainted 
4.1.0-next-20150623-sasha-00039-ga1eb83a-dirty #2280
[2175114.194463]  880a2335 6a8e22d4 880a2335f878 
abc8cfa3
[2175114.196547]    880a2335f8c8 
a21ebd36
[2175114.198604]  880e60fe09e0 a24608f4 880e61b14830 
880e60fe09d8
[2175114.200666] Call Trace:
[2175114.201377]  [] dump_stack+0x4f/0x7b
[2175114.202793]  [] warn_slowpath_common+0xc6/0x120
[2175114.206235]  [] warn_slowpath_null+0x1a/0x20
[2175114.207819]  [] replace_preds+0x814/0x2140
[2175114.216433]  [] create_filter+0x15a/0x210
[2175114.231529]  [] apply_event_filter+0x28b/0x780
[2175114.241196]  [] event_filter_write+0x106/0x1c0
[2175114.242823]  [] do_loop_readv_writev+0x128/0x1e0
[2175114.248901]  [] do_readv_writev+0x5ae/0x6c0
[2175114.256760]  [] vfs_writev+0x72/0xb0
[2175114.258134]  [] SyS_pwritev+0x1b4/0x220
[2175114.261291]  [] tracesys_phase2+0x88/0x8d


Thanks,
Sasha


[PATCH] net/fsl: remove dependency FSL_SOC for Gianfar

2015-06-24 Thread Alison Wang
CONFIG_GIANFAR does not depend on FSL_SOC; it
can be built on non-PPC platforms.

Signed-off-by: Alison Wang 
---
 drivers/net/ethernet/freescale/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
index b8de87b..ff76d4e 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -83,12 +83,12 @@ config UGETH_TX_ON_DEMAND
 
 config GIANFAR
tristate "Gianfar Ethernet"
-   depends on FSL_SOC
select FSL_PQ_MDIO
select PHYLIB
select CRC32
---help---
  This driver supports the Gigabit TSEC on the MPC83xx, MPC85xx,
- and MPC86xx family of chips, and the FEC on the 8540.
+ and MPC86xx family of chips, the eTSEC on LS1021A and the FEC
+ on the 8540.
 
 endif # NET_VENDOR_FREESCALE
-- 
2.1.0.27.g96db324



[GIT PULL] ext4 changes for 4.2-rc1

2015-06-24 Thread Theodore Ts'o
The following changes since commit e26081808edadfd257c6c9d81014e3b25e9a6118:

  Linux 4.1-rc4 (2015-05-18 10:13:47 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus

for you to fetch changes up to a2fd66d069d86d793e9d39d4079b96f46d13f237:

  ext4: set lazytime on remount if MS_LAZYTIME is set by mount (2015-06-23 11:03:54 -0400)


A very large number of cleanups and bug fixes --- in particular for
the ext4 encryption patches, which is a new feature added in the last
merge window.  Also fix a number of long-standing xfstest failures.
(Quota writes failing due to ENOSPC, a race between truncate and
writepage in data=journalled mode that was causing generic/068 to
fail, and other corner cases.)

Also add support for FALLOC_FL_INSERT_RANGE, and improve jbd2
performance eliminating locking when a buffer is modified more than
once during a transaction (which is very common for allocation
bitmaps, for example), in which case the state of the journalled
buffer head doesn't need to change.


Andreas Dilger (1):
  ext4: improve warning directory handling messages

Chao Yu (1):
  ext4 crypto: release crypto resource on module exit

Darrick J. Wong (1):
  ext4: don't retry file block mapping on bigalloc fs with non-extent file

David Moore (1):
  ext4: BUG_ON assertion repeated for inode1, not done for inode2

Dmitry Monakhov (1):
  jbd2: use GFP_NOFS in jbd2_cleanup_journal_tail()

Eric Whitney (2):
  ext4: minor cleanup of ext4_da_reserve_space()
  ext4: make online defrag error reporting consistent

Fabian Frederick (3):
  ext4 crypto: fix sparse warnings in fs/ext4/ioctl.c
  ext4: use swap() in memswap()
  ext4: use swap() in mext_page_double_lock()

Jan Kara (5):
  jbd2: simplify code flow in do_get_write_access()
  jbd2: simplify error path on allocation failure in do_get_write_access()
  jbd2: more simplifications in do_get_write_access()
  jbd2: speedup jbd2_journal_get_[write|undo]_access()
  jbd2: speedup jbd2_journal_dirty_metadata()

Josef Bacik (1):
  ext4: only call ext4_truncate when size <= isize

Joseph Qi (1):
  jbd2: fix ocfs2 corrupt when updating journal superblock fails

Lukas Czerner (5):
  ext4: verify block bitmap even after fresh initialization
  ext4: try to initialize all groups we can in case of failure on ppc64
  ext4: return error code from ext4_mb_good_group()
  ext4: recalculate journal credits as inode depth changes
  ext4: wait for existing dio workers in ext4_alloc_file_blocks()

Michal Hocko (2):
  jbd2: revert must-not-fail allocation loops back to GFP_NOFAIL
  jbd2: get rid of open coded allocation retry loop

Namjae Jeon (1):
  ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate

Rasmus Villemoes (1):
  ext4: mballoc: avoid 20-argument function call

Theodore Ts'o (26):
  ext4 crypto: optimize filename encryption
  ext4 crypto: don't allocate a page when encrypting/decrypting file names
  ext4 crypto: separate kernel and userspace structure for the key
  ext4 crypto: reorganize how we store keys in the inode
  ext4: clean up superblock encryption mode fields
  ext4 crypto: use slab caches
  ext4 crypto: get rid of ci_mode from struct ext4_crypt_info
  ext4 crypto: shrink size of the ext4_crypto_ctx structure
  ext4 crypto: require CONFIG_CRYPTO_CTR if ext4 encryption is enabled
  ext4 crypto: use per-inode tfm structure
  ext4 crypto: fix memory leaks in ext4_encrypted_zeroout
  ext4 crypto: set up encryption info for new inodes in ext4_inherit_context()
  ext4 crypto: make sure the encryption info is initialized on opendir(2)
  ext4 crypto: encrypt tmpfile located in encryption protected directory
  ext4 crypto: enforce crypto policy restrictions on cross-renames
  ext4 crypto: policies may only be set on directories
  ext4 crypto: clean up error handling in ext4_fname_setup_filename
  ext4 crypto: allocate the right amount of memory for the on-disk symlink
  ext4 crypto: handle unexpected lack of encryption keys
  ext4 crypto: allocate bounce pages using GFP_NOWAIT
  ext4 crypto: fix ext4_get_crypto_ctx()'s calling convention in ext4_decrypt_one
  ext4 crypto: fail the mount if blocksize != pagesize
  ext4: fix race between truncate and __ext4_journalled_writepage()
  ext4: call sync_blockdev() before invalidate_bdev() in put_super()
  ext4: prevent ext4_quota_write() from failing due to ENOSPC
  ext4: set lazytime on remount if MS_LAZYTIME is set by mount

 fs/ext4/Kconfig |   1 +
 fs/ext4/balloc.c|   4 +-
 fs/ext4/crypto.c| 211 +--
 fs/ext4/crypto_fname.c  | 490 

Re: [GIT PULL] f2fs updates for v4.2

2015-06-24 Thread Linus Torvalds
On Wed, Jun 24, 2015 at 1:25 PM, Jaegeuk Kim  wrote:
>
> New features are:
>  o per-file encryption (e.g., ext4)

The new encrypted symlinks needed fixups for the changes that happened
meanwhile to the symlink handling. I did all that in my merge, and I
*think* I got it all right, but I would like you to check. In
particular, I hope you have a test-case and can actually give it a
whirl on that.

Al added to cc, just in case he could also check my merge resolution
of fs/f2fs/namei.c (the merge is commit cfcc0ad47f4c, I'll push it out
after I've finished the filesystem pulls)

Thanks,
   Linus


Re: [PATCH V2] trace-cmd: add option to group like comms for profile

2015-06-24 Thread Steven Rostedt
On Thu, 21 May 2015 13:30:08 -0400
Josef Bacik  wrote:


> +static void merge_tasks(struct handle_data *h)
> +{
> + struct trace_hash_item **bucket;
> + struct trace_hash_item *item;
> +
> + if (!merge_like_comms)
> + return;
> +
> + trace_hash_for_each_bucket(bucket, &h->task_hash) {
> + trace_hash_for_each_item(item, bucket)
> + add_group(h, task_from_item(item));
> + }
> +}
> +
>  int trace_profile(void)
>  {
>   struct handle_data *h;
>  
>   for (h = handles; h; h = h->next) {
> + if (merge_like_comms)
> + merge_tasks(h);

I don't think we need the double check. Here you only call
merge_tasks() if merge_like_comms is set, but then the first thing you
do in merge_tasks() is to return if merge_like_comms is not set. One
check is enough.

-- Steve


>   output_handle(h);
>   trace_hash_free(&h->task_hash);
>   }



Re: [RFC][PATCH] sched: might_sleep(): do rate-limiting before sanity checks

2015-06-24 Thread Dave Hansen
On 06/24/2015 05:03 PM, Dave Hansen wrote:
> In any case, we ratelimit might_sleep() checks anyway.  But, we
> do the ratelimiting *after* we check the other conditions for
> might_sleep() including the (costly) irqs_disabled() call.

Thinking about this a bit more, this patch is wrong.

This only does a _check_ once per jiffy instead of just one warning per
jiffy, which is totally bogus.
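
To make the difference concrete, here is a toy userspace model. It is not
the kernel's might_sleep() code; the function names and the "last"
parameter are made up for illustration, and the ratelimit window is taken
to be one jiffy as described above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only: *last holds the jiffy of the most recent check
 * (first function) or the most recent warning (second function).
 */

/* Order in the RFC patch: ratelimit first, so only one check runs per jiffy. */
static bool warn_ratelimit_first(unsigned long *last, unsigned long now,
				 bool in_atomic_ctx)
{
	if (*last == now)
		return false;	/* check skipped, even if in_atomic_ctx is true */
	*last = now;
	return in_atomic_ctx;
}

/* Existing order: check on every call, ratelimit only the warning. */
static bool warn_check_first(unsigned long *last, unsigned long now,
			     bool in_atomic_ctx)
{
	if (!in_atomic_ctx)
		return false;
	if (*last == now)
		return false;	/* already warned during this jiffy */
	*last = now;
	return true;
}

/* A clean call followed by a buggy call within the same jiffy. */
static void ratelimit_order_demo(void)
{
	unsigned long a = 0, b = 0;

	assert(!warn_ratelimit_first(&a, 100, false));
	assert(!warn_ratelimit_first(&a, 100, true));	/* bug goes unreported */

	assert(!warn_check_first(&b, 100, false));
	assert(warn_check_first(&b, 100, true));	/* bug is reported */
}
```

With the ratelimit moved first, the second call in the same jiffy is
never examined, so the sleeping-in-atomic case is silently missed.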

I would be interested, though, if anybody has any ideas about speeding
up the irqs_disabled() checking.


Re: [PATCH] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

2015-06-24 Thread Preeti U Murthy
On 06/25/2015 05:36 AM, Rafael J. Wysocki wrote:
> On Thu, Jun 25, 2015 at 12:06 AM, Benjamin Herrenschmidt
>  wrote:
>> On Wed, 2015-06-24 at 15:50 +0200, Rafael J. Wysocki wrote:
>>> 4.2 material I suppose?
>>
>> And stable. Without this, if you configure without TICK_ONESHOT, the
>> machine will hang.
> 
> OK, which -stable?  All of them or any specific series?

This needs to go into stable/linux-3.19.y,
stable/linux-4.0.y, stable/linux-4.1.y.

Thanks

Regards
Preeti U Murthy
> 
> Rafael
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 



Re: [PATCH V2] trace-cmd: add option to group like comms for profile

2015-06-24 Thread Steven Rostedt
On Thu, 21 May 2015 13:30:08 -0400
Josef Bacik  wrote:


> +static int compare_groups(const void *a, const void *b)
> +{
> + const char *A = a;
> + const char *B = b;
> +
> + return strcmp(A, B);

a and b are not strings here. They are group_data pointers.

I think what you want is this:

static int compare_groups(const void *a, const void *b)
{
struct group_data * const *A = a;
struct group_data * const *B = b;

return strcmp((*A)->comm, (*B)->comm);
}
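
For reference, a small standalone sketch of why the extra dereference is
needed. qsort() hands the comparator pointers to the array elements, and
the elements here are themselves pointers. The struct below is a
hypothetical stand-in with only a comm field, not the real group_data
from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct group_data. */
struct group_data {
	const char *comm;
};

/*
 * qsort() passes the address of each array slot. Each slot holds a
 * struct group_data *, so a and b are really struct group_data **.
 */
static int compare_groups(const void *a, const void *b)
{
	struct group_data * const *A = a;
	struct group_data * const *B = b;

	return strcmp((*A)->comm, (*B)->comm);
}

/* Sort an array of group pointers by comm and check the order. */
static void compare_groups_demo(void)
{
	struct group_data g1 = { "trace-cmd" };
	struct group_data g2 = { "bash" };
	struct group_data g3 = { "sshd" };
	struct group_data *groups[] = { &g1, &g2, &g3 };

	qsort(groups, 3, sizeof(*groups), compare_groups);

	assert(groups[0] == &g2);	/* "bash" */
	assert(groups[1] == &g3);	/* "sshd" */
	assert(groups[2] == &g1);	/* "trace-cmd" */
}
```

Casting a and b straight to char * would instead compare the raw bytes of
the pointers stored in the array, which is not a string compare of comm.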

-- Steve

> +}
> +
> 


> +static void output_groups(struct handle_data *h)
> +{
> + struct trace_hash_item **bucket;
> + struct trace_hash_item *item;
> + struct group_data **groups;
> + int nr_groups = 0;
> + int i;
> +
> + trace_hash_for_each_bucket(bucket, &h->group_hash) {
> + trace_hash_for_each_item(item, bucket) {
> + nr_groups++;
> + }
> + }
> +
> + if (nr_groups == 0)
> + return;
> +
> + groups = malloc_or_die(sizeof(*groups) * nr_groups);
> +
> + nr_groups = 0;
> +
> + trace_hash_for_each_bucket(bucket, &h->group_hash) {
> + trace_hash_while_item(item, bucket) {
> + groups[nr_groups++] = group_from_item(item);
> + trace_hash_del(item);
> + }
> + }
> +
> + qsort(groups, nr_groups, sizeof(*groups), compare_groups);
> +
> + for (i = 0; i < nr_groups; i++) {
> + output_group(h, groups[i]);
> + free_group(groups[i]);
> + }
> +
> + free(groups);
> +}
> +
>


[PATCH] kexec: Make a pair of reserved pages when kdump fails to start

2015-06-24 Thread Minfei Huang
From: Minfei Huang 

For some architectures, kexec must map the reserved pages before using
them when we try to start the kdump service.

Currently kexec never unmaps the reserved pages if it fails partway
through starting the kdump service.

Pair the map and unmap of the reserved pages in the kdump start path,
whether kexec fails or not.

Signed-off-by: Minfei Huang 
---
 kernel/kexec.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 7a36fdc..ab32d59 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1308,19 +1308,23 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, 
unsigned long, nr_segments,
image->preserve_context = 1;
result = machine_kexec_prepare(image);
if (result)
-   goto out;
+   goto failure;
 
for (i = 0; i < nr_segments; i++) {
result = kimage_load_segment(image, &image->segment[i]);
if (result)
-   goto out;
+   goto failure;
}
kimage_terminate(image);
+
+failure:
if (flags & KEXEC_ON_CRASH)
crash_unmap_reserved_pages();
}
-   /* Install the new kernel, and  Uninstall the old */
-   image = xchg(dest_image, image);
+
+   if (result == 0)
+   /* Install the new kernel, and  Uninstall the old */
+   image = xchg(dest_image, image);
 
 out:
mutex_unlock(&kexec_mutex);
-- 
2.2.2



Re: [PATCH] dell-laptop: Fix allocating & freeing SMI buffer page

2015-06-24 Thread Darren Hart
On Tue, Jun 23, 2015 at 10:11:19AM +0200, Pali Rohár wrote:
> This commit fixes a kernel crash when probing for rfkill devices in the
> dell-laptop driver fails. The function free_page() was incorrectly used on
> a struct page * instead of the virtual address of the SMI buffer.
> 
> This commit also simplifies allocating the page for the SMI buffer by using
> the __get_free_page() function instead of sequential calls to
> alloc_page() and page_address().
> 
> Signed-off-by: Pali Rohár 
> Acked-by: Michal Hocko 
> Cc: sta...@vger.kernel.org

Queued, thanks Pali.

-- 
Darren Hart
Intel Open Source Technology Center


Re: [RFC][PATCH 12/13] stop_machine: Remove lglock

2015-06-24 Thread Paul E. McKenney
On Wed, Jun 24, 2015 at 07:58:30PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 24, 2015 at 10:10:17AM -0700, Paul E. McKenney wrote:
> > > The thing is, once you start bailing on this condition your 'queue'
> > > drains very fast and this is around the same time sync_rcu() would've
> > > released the waiters too.
> > 
> > In my experience, this sort of thing simply melts down on large systems.
> > I am reworking this with multiple locks so as to keep the large-system
> > contention down to a dull roar.
> 
> So with the MCS queue we've got less global thrashing than you had with
> the start/done tickets. Only the queue head on enqueue.

Here is what I had in mind, where you don't have any global thrashing
except when the ->expedited_sequence gets updated.  Passes mild rcutorture
testing.
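
The work-done check in the patch below relies on a wrap-tolerant
comparison of the sequence counter against a snapshot. A minimal
userspace sketch of that comparison follows. The macro has the same form
as the kernel's ULONG_CMP_GE() as far as I recall, and exp_work_done() is
a made-up helper name used only for illustration.

```c
#include <assert.h>
#include <limits.h>

/*
 * Wrap-tolerant "a >= b" for free-running unsigned long counters:
 * true when a is at most ULONG_MAX / 2 ahead of b, modulo wraparound.
 */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))

/* Hypothetical helper: has the shared counter reached snapshot s? */
static int exp_work_done(unsigned long expedited_sequence, unsigned long s)
{
	return ULONG_CMP_GE(expedited_sequence, s);
}

static void exp_work_done_demo(void)
{
	assert(exp_work_done(5, 3));		/* counter is past snapshot */
	assert(exp_work_done(7, 7));		/* counter equals snapshot */
	assert(!exp_work_done(3, 5));		/* counter has not reached it */
	assert(exp_work_done(2, ULONG_MAX - 1));	/* counter wrapped past it */
}
```

A plain ">=" would get the last case wrong once the counter wraps, which
is why the unsigned subtraction is used instead.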

Still needs asynchronous CPU stoppage and stall warnings and trace
documentation updates.  Plus fixes for whatever bugs show up.

Thanx, Paul



diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 78d0a87ff354..887370b7e52a 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -70,6 +70,7 @@ MODULE_ALIAS("rcutree");
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
+static struct lock_class_key rcu_exp_class[RCU_NUM_LVLS];
 
 /*
  * In order to export the rcu_state name to the tracing tools, it
@@ -3323,6 +3324,22 @@ static int synchronize_sched_expedited_cpu_stop(void 
*data)
return 0;
 }
 
+/* Common code for synchronize_sched_expedited() work-done checking. */
+static bool sync_sched_exp_wd(struct rcu_state *rsp, struct rcu_node *rnp,
+ atomic_long_t *stat, unsigned long s)
+{
+   if (ULONG_CMP_GE(READ_ONCE(rsp->expedited_sequence), s)) {
+   if (rnp)
+   mutex_unlock(&rnp->exp_funnel_mutex);
+   /* Ensure test happens before caller kfree(). */
+   smp_mb__before_atomic(); /* ^^^ */
+   atomic_long_inc(stat);
+   put_online_cpus();
+   return true;
+   }
+   return false;
+}
+
 /**
  * synchronize_sched_expedited - Brute-force RCU-sched grace period
  *
@@ -3334,58 +3351,24 @@ static int synchronize_sched_expedited_cpu_stop(void 
*data)
  * restructure your code to batch your updates, and then use a single
  * synchronize_sched() instead.
  *
- * This implementation can be thought of as an application of ticket
- * locking to RCU, with sync_sched_expedited_started and
- * sync_sched_expedited_done taking on the roles of the halves
- * of the ticket-lock word.  Each task atomically increments
- * sync_sched_expedited_started upon entry, snapshotting the old value,
- * then attempts to stop all the CPUs.  If this succeeds, then each
- * CPU will have executed a context switch, resulting in an RCU-sched
- * grace period.  We are then done, so we use atomic_cmpxchg() to
- * update sync_sched_expedited_done to match our snapshot -- but
- * only if someone else has not already advanced past our snapshot.
- *
- * On the other hand, if try_stop_cpus() fails, we check the value
- * of sync_sched_expedited_done.  If it has advanced past our
- * initial snapshot, then someone else must have forced a grace period
- * some time after we took our snapshot.  In this case, our work is
- * done for us, and we can simply return.  Otherwise, we try again,
- * but keep our initial snapshot for purposes of checking for someone
- * doing our work for us.
- *
- * If we fail too many times in a row, we fall back to synchronize_sched().
+ * This implementation can be thought of as an application of sequence
+ * locking to expedited grace periods, but using the sequence counter to
+ * determine when someone else has already done the work instead of for
+ * retrying readers.
  */
 void synchronize_sched_expedited(void)
 {
-   cpumask_var_t cm;
-   bool cma = false;
int cpu;
-   long firstsnap, s, snap;
-   int trycount = 0;
+   long s;
struct rcu_state *rsp = &rcu_sched_state;
+   struct rcu_node *rnp0;
+   struct rcu_node *rnp1 = NULL;
 
-   /*
-* If we are in danger of counter wrap, just do synchronize_sched().
-* By allowing sync_sched_expedited_started to advance no more than
-* ULONG_MAX/8 ahead of sync_sched_expedited_done, we are ensuring
-* that more than 3.5 billion CPUs would be required to force a
-* counter wrap on a 32-bit system.  Quite a few more CPUs would of
-* course be required on a 64-bit system.
-*/
-   if (ULONG_CMP_GE((ulong)atomic_long_read(&rsp->expedited_start),
-(ulong)atomic_long_read(&rsp->expedited_done) +
-ULONG_MAX / 8)) {
-   wait_rcu_gp(call_rcu_sched);
-   atomic_long_inc(&rsp->expedited_wrap);
-   return;
- 

Re: [PATCH V2] trace-cmd: add option to group like comms for profile

2015-06-24 Thread Steven Rostedt
On Thu, 21 May 2015 13:30:08 -0400
Josef Bacik  wrote:


> V1->V2:
> -renamed the option to --by-comm, added it to trace-cmd report --profile as 
> well
> -fixed up the string hash

Or break it ;-)

> -changed it to merge all events after the fact so it's less error prone

> diff --git a/trace-hash-local.h b/trace-hash-local.h
> index 997b11c..eaeeaaf 100644
> --- a/trace-hash-local.h
> +++ b/trace-hash-local.h
> @@ -48,4 +48,13 @@ static inline unsigned int trace_hash(int val)
>   return hash;
>  }
>  
> +static inline unsigned int trace_hash_str(char *str)
> +{
> + int val = 0;
> + int i;
> +
> + for (i = 0; str[i]; i++)
> + val += ((int)str[i]) << (i & 0xff);
> + return trace_hash(val);
> +}
>

I need to clean out my medicine cabinet and remove all the expired
meds. Because I was definitely taking something nasty when I
recommended (i & 0xff)!

When i is greater than 32 (which is less than 0xff) it will
overflow the addition. What I wanted was that we don't shift more than
24 bits, where 2 ** 24 - 1 == 0xffffff.

That should be:

val += ((int)str[i]) << (i % 25);

my bad :-/


To avoid the slow '%', I'll just use '& 0xf', as shifting it 16 times is
enough for this algorithm.
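
Roughly what I have in mind, as a standalone sketch. The trace_hash()
here is only a placeholder multiplicative mix so the example compiles on
its own; it is not the one in trace-hash-local.h. I also switched the
accumulator to unsigned so the addition is well defined, which is my
change and not part of the patch.

```c
#include <assert.h>

/* Placeholder for the real trace_hash(); the constant is an assumption. */
static inline unsigned int trace_hash(unsigned int val)
{
	return val * 2654435761u;
}

static inline unsigned int trace_hash_str(const char *str)
{
	unsigned int val = 0;
	int i;

	/*
	 * Cap the shift at 15 bits: an 8-bit character shifted by at
	 * most 15 stays well inside 32 bits, however long the string.
	 */
	for (i = 0; str[i]; i++)
		val += ((unsigned int)(unsigned char)str[i]) << (i & 0xf);
	return trace_hash(val);
}

static void trace_hash_str_demo(void)
{
	/* Same string, same hash. */
	assert(trace_hash_str("bash") == trace_hash_str("bash"));
	/* Position matters: 'a' + ('b' << 1) differs from 'b' + ('a' << 1). */
	assert(trace_hash_str("ab") != trace_hash_str("ba"));
}
```

Strings longer than 16 characters simply reuse the shift amounts, which
is fine for spreading comms across hash buckets.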

No need to send a new patch, I'll fix it, as it was my brain
fart. I'll review the rest of this patch too and apply it if nothing
sticks out.

Thanks!

-- Steve



Re: [PATCH 3/4] blk-mq: establish new mapping before cpu starts handling requests

2015-06-24 Thread Akinobu Mita
2015-06-25 1:24 GMT+09:00 Ming Lei :
> On Wed, Jun 24, 2015 at 10:34 PM, Akinobu Mita  wrote:
>> Hi Ming,
>>
>> 2015-06-24 18:46 GMT+09:00 Ming Lei :
>>> On Sun, Jun 21, 2015 at 9:52 PM, Akinobu Mita wrote:
>>>> ctx->index_hw is zero for the CPUs which have never been onlined since
>>>> the block queue was initialized.  If one of those CPUs is hotadded and
>>>> starts handling request before new mappings are established, pending
>>>
>>> Could you explain a bit what the handling request is? The fact is that
>>> blk_mq_queue_reinit() is run after all queues are put into freezing.
>>
>> Notifier callbacks for CPU_ONLINE action can be run on the other CPU
>> than the CPU which was just onlined.  So it is possible for the
>> process running on the just onlined CPU to insert request and run
>> hw queue before blk_mq_queue_reinit_notify() is actually called with
>> action=CPU_ONLINE.
>
> You are right because blk_mq_queue_reinit_notify() is always run after
> the CPU becomes UP, so there is a tiny window in which the CPU is up
> but the mapping is not yet updated.  Per current design, the CPU just onlined
> is still mapped to hw queue 0 until the mapping is updated by
> blk_mq_queue_reinit_notify().
>
> But I am wondering why it is a problem and why you think flush_busy_ctxs
> can't find the requests on the software queue in this situation?

The problem happens when the CPU has just been onlined for the first time
since the request queue was initialized.  At this time ctx->index_hw
for the CPU is still zero, because blk_mq_queue_reinit_notify() has not
been called yet.

The request can be inserted into ctx->rq_list, but blk_mq_hctx_mark_pending()
marks the wrong bit position as busy because ctx->index_hw is zero.

flush_busy_ctxs() only retrieves the requests from software queues
which are marked busy.  So the request just inserted is ignored as
the corresponding bit position is not busy.
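
To make the failure mode concrete, here is a small userspace model of it. This is only a sketch: struct sw_queue, struct hw_queue and the helper names are invented for the example and merely stand in for blk_mq_ctx, blk_mq_hw_ctx, blk_mq_hctx_mark_pending() and flush_busy_ctxs(); none of it is the actual blk-mq code.

```c
#include <stdbool.h>

#define NR_CTX 4

/* Stand-in for a per-CPU software queue (blk_mq_ctx in the kernel). */
struct sw_queue {
	unsigned int index_hw;	/* position of this ctx in the hw queue map */
	int nr_queued;		/* stands in for the length of rq_list */
};

/* Stand-in for a hardware queue (blk_mq_hw_ctx in the kernel). */
struct hw_queue {
	unsigned long pending;		/* one busy bit per mapped ctx */
	struct sw_queue *ctxs[NR_CTX];	/* index_hw -> ctx */
};

/* Queue a request and mark the ctx busy using its (possibly stale) index. */
static void insert_request(struct hw_queue *h, struct sw_queue *c)
{
	c->nr_queued++;
	h->pending |= 1UL << c->index_hw;
}

/* Drain only the software queues whose busy bit is set. */
static int flush_busy(struct hw_queue *h)
{
	int flushed = 0;
	unsigned int bit;

	for (bit = 0; bit < NR_CTX; bit++) {
		struct sw_queue *c;

		if (!(h->pending & (1UL << bit)))
			continue;
		h->pending &= ~(1UL << bit);
		c = h->ctxs[bit];
		if (c) {
			flushed += c->nr_queued;
			c->nr_queued = 0;
		}
	}
	return flushed;
}

/* Returns how many requests stay stranded on the newly onlined CPU. */
static int stale_index_demo(bool remap_first)
{
	struct sw_queue cpu0 = { .index_hw = 0 };
	struct sw_queue cpu1 = { .index_hw = 0 };  /* never onlined: still 0 */
	struct hw_queue h = { .ctxs = { &cpu0 } };

	if (remap_first) {		/* mapping established before use */
		cpu1.index_hw = 1;
		h.ctxs[1] = &cpu1;
	}
	insert_request(&h, &cpu1);
	flush_busy(&h);
	return cpu1.nr_queued;
}
```

In this model stale_index_demo(false) leaves the request behind, because bit 0 points at the other CPU's (empty) queue, while remapping first lets the flush find it.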


Re: [RFC PATCH 0/3] restartable sequences: fast user-space percpu critical sections

2015-06-24 Thread Paul Turner
On Wed, Jun 24, 2015 at 5:07 PM, Andy Lutomirski  wrote:
> On Wed, Jun 24, 2015 at 3:26 PM, Paul Turner  wrote:
>> This is a fairly small series demonstrating a feature we've found to be quite
>> powerful in practice, "restartable sequences".
>>
>
> On an extremely short glance, I'm starting to think that the right
> approach, at least for x86, is to implement per-cpu gsbase.  Then you
> could do cmpxchg with a gs prefix to atomically take a percpu lock and
> atomically release a percpu lock and check whether someone else stole
> the lock from you.  (Note: cmpxchg, unlike lock cmpxchg, is very
> fast.)
>
> This is totally useless for other architectures, but I think it would
> be reasonably clean on x86.  Thoughts?

So this gives semantics that are obviously similar to this_cpu().
This allows reasonable per-cpu counters (which alone is
almost sufficient for a strong user-space RCU implementation, giving
this some legs).

However, unless there's a nice implementation trick I'm missing, the
thing that stands out to me for locks (or other primitives) is that
this forces a two-phase commit.  There's no way (short of say,
cmpxchg16b) to perform a write conditional on the lock not having been
stolen from us (and subsequently release the lock).

e.g.
 1) We take the operation in some sort of speculative mode, that
another thread on the same cpu is still allowed to steal from us
 2) We prepare what we want to commit
 3) At this point we have to promote the lock taken in (1) to perform
our actual commit, or see that someone else has stolen (1)
 4) Release the promoted lock in (3)

However, this means that if we're preempted at (3) then no other
thread on that cpu can make progress until we've been rescheduled and
released the lock; a nice property of the model we have today is that
threads sharing a cpu can not impede each other beyond what the
scheduler allows.
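
As a rough model of steps (1) to (4), the sketch below uses ordinary C11 atomics on a single lock word. It is an assumption-laden illustration only: the function names, the PROMOTED_FLAG encoding and the use of thread ids as lock values are all invented here, and it does not model the gs-prefixed per-cpu cmpxchg that was proposed.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOCK_FREE_WORD	0u
#define PROMOTED_FLAG	0x80000000u	/* set once the holder has promoted */

/*
 * Step 1: take the lock speculatively.  A free lock or one held
 * speculatively by another thread may be taken (stolen); a promoted
 * lock may not.  tid must be non-zero and below PROMOTED_FLAG.
 */
static bool take_speculative(atomic_uint *lock, unsigned int tid)
{
	unsigned int old = atomic_load(lock);

	if (old & PROMOTED_FLAG)
		return false;
	return atomic_compare_exchange_strong(lock, &old, tid);
}

/*
 * Step 3: promote our speculative hold so the commit can proceed.
 * Fails if someone stole the lock since step 1.
 */
static bool promote(atomic_uint *lock, unsigned int tid)
{
	unsigned int expected = tid;

	return atomic_compare_exchange_strong(lock, &expected,
					      tid | PROMOTED_FLAG);
}

/* Step 4: release the promoted lock. */
static void release(atomic_uint *lock)
{
	atomic_store(lock, LOCK_FREE_WORD);
}
```

The window of concern is between promote() and release(): while the flag is set, take_speculative() fails for every other thread on that cpu, so a preempted holder blocks them until it runs again.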

A lesser concern, but worth mentioning, is that there are also
potential pitfalls in the interaction with signal handlers,
particularly if a 2-phase commit is used.

- Paul


Re: [PATCH] trace-cmd: add a kernel memory leak detector

2015-06-24 Thread Steven Rostedt
On Tue, 23 Jun 2015 16:06:39 -0700
Josef Bacik  wrote:

> I needed to track down a very slow memory leak so I adapted the same approach
> trace-cmd profile uses to track kernel memory allocations.  You run this with
> 
> trace-cmd kmemleak

Note, I'm still playing with this.

> 
> and then you can kill -SIGUSR2  to get current status updates, or
> just stop the process when you are ready.  It will tell you how much was lost
> and the size of the objects that were allocated, along with the tracebacks and
> the counts of the allocators.  Thanks,
> 
> Signed-off-by: Josef Bacik 


> diff --git a/trace-kmemleak.c b/trace-kmemleak.c
> new file mode 100644
> index 000..2e288fe
> --- /dev/null
> +++ b/trace-kmemleak.c

Please add a copyright notice here. You can add your name as author but
more importantly, please state what license this is under. See
trace-record.c for details.

> @@ -0,0 +1,552 @@
> +#define _LARGEFILE64_SOURCE
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "trace-local.h"
> +#include "trace-hash.h"
> +#include "list.h"
> +
> +#define memory_from_item(item)   container_of(item, struct memory, hash)
> +#define memory_from_phash(item)  container_of(item, struct memory, phash)
> +#define leak_from_item(item) container_of(item, struct memory_leak, hash)
> +#define edata_from_item(item) container_of(item, struct event_data, hash)
> +#define stack_from_item(item) container_of(item, struct stack_trace, hash)
> +



> --- a/trace-record.c
> +++ b/trace-record.c

> @@ -3703,6 +3721,7 @@ static void add_hook(struct buffer_instance *instance, const char *arg)
>  }
>  
>  enum {
> + OPT_kmemleak= 249,
>   OPT_bycomm  = 250,

Ug, I realized I never applied your bycomm patch.

I'll need to look at that now too. Egad, I've been putting off
trace-cmd for too long. I need to start getting back to it!

-- Steve

>   OPT_stderr  = 251,
>   OPT_profile = 252,
> @@ -3738,7 +3757,7 @@ void trace_record (int argc, char **argv)



Re: [PATCH 6/6] mtd: docg3: Don't do ERR_PTR(0)

2015-06-24 Thread Brian Norris
Hi Robert,

On Tue, Jun 23, 2015 at 10:41:33PM +0200, Robert Jarzmik wrote:
> Richard Weinberger  writes:
> 
> > Am 17.06.2015 um 20:41 schrieb Brian Norris:
> >> Have you tested this patch?
> >
> > nah, I don't own such a device.
> But I do. If you resend a patch, please Cc me. You can even ask for a test from
> time to time if you want a confirmation, I have a 2 floors docg3 device.

Do you want to be on a MAINTAINERS entry for this driver?

Brian


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Linus Torvalds
On Wed, Jun 24, 2015 at 7:14 PM, Steven Rostedt  wrote:
>
> I don't think it will complicate things even if the API changes. The distros
> will have to deal with that fall out. Mainline only cares about its own
> regressions. But any API changes would only be done for good reasons, and give
> the distros an excuse to fix whatever was done wrong in the first place.

I don't think that's true.

Realistically, every single kernel developer tends to work on a
machine with some random distro. If that developer cannot compile his
own kernel because his distro stops working, or has to use some
"kdbus=0" switch to turn off the kernel kdbus and (hopefully) the
distro just switches to the legacy user mode bus, then for that
developer, merging and enabling an incompatible kdbus implementation is
basically a regression.

We've seen this before. We end up stuck with the ABI of whatever user
land applications. It doesn't matter where that ABI came from.

I do agree that distros that want to enable kdbus before any agreed
version has been merged would get to also act as guinea pigs and do
their own QA, and handle fallout from whatever problems they encounter
etc. That part might be good. But I don't think we really end up
having the option to make up some incompatible kdbus ABI
after-the-fact.

  Linus


Re: kdbus: to merge or not to merge?

2015-06-24 Thread Steven Rostedt
On Tue, Jun 23, 2015 at 08:07:41AM -0700, Andy Lutomirski wrote:
> 
> FWIW, once there are real distros with kdbus userspace enabled,
> reviewing kdbus gets more complicated -- we'll be in the position
> where merging kdbus in a different form from that which was proposed
> will break existing users.

Actually, I think distros having it in their kernel before it's in mainline is
actually a good thing. Let them straighten out the issues that may come up
(not to mention possible CVEs). If the distros have it in their kernels and
out in the public for 6 months or more, that may give enough information as to
whether or not it should be merged.

I don't think it will complicate things even if the API changes. The distros
will have to deal with that fall out. Mainline only cares about its own
regressions. But any API changes would only be done for good reasons, and give
the distros an excuse to fix whatever was done wrong in the first place.

-- Steve



RE: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-24 Thread Wu, Feng


> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, June 25, 2015 3:49 AM
> To: Eric Auger
> Cc: Joerg Roedel; Avi Kivity; Wu, Feng; k...@vger.kernel.org;
> linux-kernel@vger.kernel.org; pbonz...@redhat.com; mtosa...@redhat.com
> Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> 
> On Wed, 2015-06-24 at 18:25 +0200, Eric Auger wrote:
> > Hi Joerg,
> >
> > On 06/24/2015 05:50 PM, Joerg Roedel wrote:
> > > On Mon, Jun 15, 2015 at 06:17:03PM +0200, Eric Auger wrote:
> > >> I guess this discussion also is relevant wrt "[RFC v6 00/16] KVM-VFIO
> > >> IRQ forward control" series? Or is that "central registry maintained by
> > >> a posted interrupts manager" something more specific to x86?
> > >
> > > From what I understood so far, the feature you implemented for ARM is a
> > > bit different from the ones that get introduced to x86.
> > >
> > > Can you please share some details on how the ARM version works? I am
> > > interested in how the GICv2 is configured for IRQ forwarding. The
> > > question is whether the forwarding information needs to be updated from
> > > KVM and what information about the IRQ KVM needs for this.
> >
> > The principle is that when you inject a virtual IRQ to a guest, you
> > program a register in the GIC, known as a list register. There you put
> > both the virtual IRQ you want to inject but also the physical IRQ it is
> > linked with (HWbit mode set = forwarding set). When the guest completes
> > the virtual IRQ the GIC HW automatically deactivates the physical IRQ
> > found in the list register. In that mode the physical IRQ deactivation
> > is under the ownership of the guest (actually automatically done by the HW).
> >
> > If HWbit mode is *not* set (forwarding not set), you do not specify the
> > HW IRQ in the list register. The host deactivates the physical IRQ &
> > masks it before triggering the virtual IRQ. Only the virtual IRQ ID is
> > programmed in the list register. When the guest completes the virtual
> > IRQ, a physical maintenance IRQ is triggered. The hyp mode is entered
> > and eventually the host unmasks the IRQ.
> >
> > Some illustrations can be found in
> > http://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf
> 
> 
> I think an important aspect for our design is that in the case of Posted
> Interrupts, they're only used for edge triggered interrupts so VFIO is
> only an information provider for KVM to configure it. 

Exactly! For PI, KVM only needs some information from VFIO when the
guests set the irq affinity.

Thanks,
Feng

> VFIO will
> hopefully just see fewer interrupts as they magically appear directly in
> the guest.  IRQ Forwarding however affects the de-assertion of level
> triggered interrupts.  VFIO needs to switch to something more like an
> edge handler when IRQ Forwarding is enabled.  So in that model, VFIO
> needs to provide information as well as consume it to change behavior.
> Thanks,
> 
> Alex


RE: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-24 Thread Wu, Feng


> -----Original Message-----
> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Wednesday, June 24, 2015 11:46 PM
> To: Alex Williamson
> Cc: Wu, Feng; Eric Auger; Avi Kivity; k...@vger.kernel.org;
> linux-kernel@vger.kernel.org; pbonz...@redhat.com; mtosa...@redhat.com
> Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> 
> On Thu, Jun 18, 2015 at 02:04:08PM -0600, Alex Williamson wrote:
> > There are plenty of details to be filled in,
> 
> I also need to fill plenty of details in my head first, so here are some
> suggestions based on my current understanding. Please don't hesitate to
> correct me where I got something wrong.
> 
> So first I totally agree that the handling of PI/non-PI configurations
> should be transparent to user-space.
> 
> I read a bit through the VT-d spec, and my understanding of posted
> interrupts so far is that:
> 
>   1) Each VCPU gets a PI-Descriptor with its pending Posted
>  Interrupts. This descriptor needs to be updated when a VCPU
>  is migrated to another PCPU and should thus be under control
>  of KVM.
> 
>  This is similar to the vAPIC backing page in the AMD version
>  of this, except that the PCPU routing information is stored
>  somewhere else on AMD.
> 
>   2) As long as the VCPU runs the IRTEs are configured for
>  posting, when the VCPU goes to sleep the old remapped entry is
>  established again. So when the VCPU sleeps the interrupt
>  would get routed to VFIO and forwarded through the eventfd.

When the vCPU sleeps, say, blocked while the guest is executing HLT, the
interrupt is still in posted mode. The solution is that, while the vCPU is
blocked, we use another notification vector (named the wakeup notification
vector) to wake up the blocked vCPU when an interrupt happens. In the wakeup
event handler, we unblock the vCPU.

Thanks,
Feng

> 
>  This would be different to the AMD version, where we have a
>  running bit. When this is clear the IOMMU will trigger an event
>  in its event-log. This might need special handling in VFIO
>  ('might' because VFIO does not need to forward the interrupt,
>   it just needs to make sure the VCPU wakes up).
> 
>  Please correct me if my understanding of the Intel version is
>  wrong.
> 
> So most of the data structures the IOMMU reads for this need to be
> updated from KVM code (either x86-generic or AMD/Intel specific code),
> as KVM has the information about VCPU load/unload and the IRQ routing.

Yes, this part has nothing to do with VFIO, KVM itself can handle it well.

> 
> What KVM needs from VFIO is the information about the physical
> interrupts, and it makes total sense to attach them as metadata to the
> eventfd.

When the guest sets the irq affinity, QEMU first gets the MSI/MSI-X
configuration, then it passes this information to kernel space via the VFIO
infrastructure. We need this MSI/MSI-X configuration to update the associated
posted-format IRTE accordingly. This is the key point for PI in terms of VFIO.

Thanks,
Feng

> 
> But the problems start at what this metadata should look like. It would
> be good to have some generic description, but not sure if this is
> possible. Otherwise this metadata would need to be requested by VFIO
> from the IOMMU driver and passed on to KVM, which it then passes back to
> the IOMMU driver. Or something like that.
> 
> 
> 
>   Joerg



Re: [PATCH v2 2/4] net: dsa: add support for switchdev VLAN objects

2015-06-24 Thread Scott Feldman
On Wed, Jun 24, 2015 at 11:50 AM, Vivien Didelot
 wrote:
> This patch adds the glue between DSA and switchdev operations to add,
> delete and dump SWITCHDEV_OBJ_PORT_VLAN objects.
>
> This is a first step to link the "bridge vlan" command with hardware
> entries for DSA compatible switch chips.
>
> Signed-off-by: Vivien Didelot 

Acked-by: Scott Feldman 


[PATCH 1/1] perf/x86: fix SLM MSR_OFFCORE_RSP1 valid_mask

2015-06-24 Thread kan . liang
From: Kan Liang 

AVG_LATENCY(bit 38) is only available on MSR_OFFCORE_RSP0.
So the bit should be removed from RSP1 valid_mask.

Since RSP0 and RSP1 may have different valid_mask, intel_alt_er should
validate the config on the alternate offcore reg before replacing it.

Signed-off-by: Kan Liang 
---
 arch/x86/kernel/cpu/perf_event_intel.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index b9826a9..71815cf 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1114,7 +1114,7 @@ static struct extra_reg intel_slm_extra_regs[] __read_mostly =
 {
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x768005ffffull, RSP_0),
-   INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x768005ffffull, RSP_1),
+   INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x368005ffffull, RSP_1),
EVENT_EXTRA_END
 };
 
@@ -1699,18 +1699,22 @@ intel_bts_constraints(struct perf_event *event)
return NULL;
 }
 
-static int intel_alt_er(int idx)
+static int intel_alt_er(int idx, u64 config)
 {
+   int alt_idx = idx;
if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
return idx;
 
if (idx == EXTRA_REG_RSP_0)
-   return EXTRA_REG_RSP_1;
+   alt_idx = EXTRA_REG_RSP_1;
 
if (idx == EXTRA_REG_RSP_1)
-   return EXTRA_REG_RSP_0;
+   alt_idx = EXTRA_REG_RSP_0;
 
-   return idx;
+   if (config & ~x86_pmu.extra_regs[alt_idx].valid_mask)
+   return idx;
+
+   return alt_idx;
 }
 
 static void intel_fixup_er(struct perf_event *event, int idx)
@@ -1799,7 +1803,7 @@ again:
 */
c = NULL;
} else {
-   idx = intel_alt_er(idx);
+   idx = intel_alt_er(idx, reg->config);
if (idx != reg->idx) {
raw_spin_unlock_irqrestore(&era->lock, flags);
goto again;
-- 
1.8.3.1
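
For readers who want the idea without the surrounding perf code, here is a self-contained userspace sketch of the check the patch adds. Everything in it is hypothetical: the enum, the valid_mask_sketch[] table and the mask values are invented for illustration (only "bit 38 is valid on RSP0 but not on RSP1" is taken from the description above), and it is not the kernel implementation.

```c
#include <stdint.h>

enum { REG_RSP_0, REG_RSP_1, REG_OTHER, REG_MAX };

#define AVG_LATENCY_BIT	(1ULL << 38)

/* Illustrative masks only: bit 38 is accepted by RSP_0 but not RSP_1. */
static const uint64_t valid_mask_sketch[REG_MAX] = {
	[REG_RSP_0] = 0xffffULL | AVG_LATENCY_BIT,
	[REG_RSP_1] = 0xffffULL,
	[REG_OTHER] = 0,
};

/*
 * Return the alternate offcore register for idx, but only if config
 * fits the alternate's valid mask; otherwise stay on idx.
 */
static int alt_reg_sketch(int idx, uint64_t config)
{
	int alt_idx = idx;	/* default: no alternate */

	if (idx == REG_RSP_0)
		alt_idx = REG_RSP_1;
	else if (idx == REG_RSP_1)
		alt_idx = REG_RSP_0;

	if (config & ~valid_mask_sketch[alt_idx])
		return idx;

	return alt_idx;
}
```

A config that sets AVG_LATENCY_BIT therefore stays on REG_RSP_0, while one without it may move to REG_RSP_1.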



RE: [GIT] Networking

2015-06-24 Thread Weiny, Ira
Linus,

> 
> On the *other* side of the same conflict, I find an even more offensive commit,
> namely commit 4cd7c9479aff ("IB/mad: Add support for additional MAD info
> to/from drivers") which adds a BUG_ON() for a sanity check, rather than just
> returning -EINVAL or something sane like that.
> 
> I'm getting *real* tired of that BUG_ON() shit. I realize that infiniband is a
> niche market, and those "commercial grade" niche markets are more-than-
> used-to crap code and horrible hacks, but this is still the kernel. We don't add
> random machine-killing debug checks when it is *so* simple to just do
> 
>     if (WARN_ON_ONCE(..))
>         return -EINVAL;
> 
> instead.

Please accept my apologies.  The original patch used WARN_ON but I was advised
to use BUG_ON in a review and I should have thought about it more rather than
blindly making the change.
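
For reference, the warn-once-and-return-an-error shape quoted above can be modelled in plain userspace C roughly as follows. This is only a sketch: warn_once() and check_mad_size_sketch() are invented names, the size check is a made-up example condition, and the real kernel macro is WARN_ON_ONCE(), which this does not reproduce exactly.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's WARN_ON_ONCE(): report the
 * problem the first time it is seen, but always return the condition
 * so the caller can bail out instead of killing the machine.
 */
static bool warn_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical sanity check: reject an oversized buffer with -EINVAL. */
static int check_mad_size_sketch(size_t size, size_t max)
{
	static bool warned;

	if (warn_once(size > max, &warned, "MAD size exceeds maximum"))
		return -EINVAL;

	return 0;
}
```

The caller still gets an error on every bad call, while the log only carries the warning once.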

> 
> Killing the machine for idiotic things like that is truly offensive, and truly
> horrible horrible code. Why do I keep on having to tell people off for doing
> these things? Why do people keep thinking that debugging-by-killing-the-
> machine is a good idea?
> 
> Either that BUG_ON() cannot possibly happen, in which case it should damn
> well not exist in the first place. Or it's a valuable debug aid, in which case it
> should damn well not be a BUG_ON. You can't have it both ways.

It was intended as a debug aid.

> 
> The next pointless BUG_ON() I see, I will start getting _really_ unpleasant
> about.
> 
> Doug, get rid of those things asap.

Fix submitted to Doug.

https://patchwork.kernel.org/patch/6671931/

Ira



Re: linux-next: build failure after merge of the modules tree

2015-06-24 Thread Stephen Rothwell
Hi Dan,

On Thu, 25 Jun 2015 08:57:06 +1000 Stephen Rothwell  wrote:
>
> On Wed, 24 Jun 2015 14:18:44 -0400 Dan Streetman  wrote:
> >
> > On Tue, Jun 23, 2015 at 9:37 PM, Stephen Rothwell  wrote:
> > >
> > > After merging the modules tree, today's linux-next build (x86_64
> > > allmodconfig) failed like this:
> > 
> > that's weird.  Are you sure it failed during allmodconfig?  I can see
> > why it would fail like that if CONFIG_MODULES isn't defined, which
> > I'll send a patch for...
> 
> Pretty sure - and, in any case, I don't do any CONFIG_MODULES=n builds
> between tree merges (only later in the day).  That is why I couldn't
> figure out what went wrong.
> 
> I will apply your patch today and see if that helps.

I built without your patch and it failed again, but applying your patch fixes it.

Rusty, you can consider this

Tested-by: Stephen Rothwell 

for "[PATCH] modules: only use mod->param_lock if CONFIG_MODULES"
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH v8 9/9] video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. In order to
take advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdefery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
printed when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Rob Clark 
Cc: Laurent Pinchart 
Cc: Jingoo Han 
Cc: "Lad, Prabhakar" 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/vt8623fb.c | 31 ++-
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
index ea7f056..60f24828 100644
--- a/drivers/video/fbdev/vt8623fb.c
+++ b/drivers/video/fbdev/vt8623fb.c
@@ -26,13 +26,9 @@
 #include  /* Why should fb driver call console functions? because console_lock() */
 #include 
 
-#ifdef CONFIG_MTRR
-#include 
-#endif
-
 struct vt8623fb_info {
char __iomem *mmio_base;
-   int mtrr_reg;
+   int wc_cookie;
struct vgastate state;
struct mutex open_lock;
unsigned int ref_count;
@@ -99,10 +95,7 @@ static struct svga_timing_regs vt8623_timing_regs = {
 /* Module parameters */
 
 static char *mode_option = "640x480-8@60";
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 MODULE_AUTHOR("(c) 2006 Ondrej Zajicek ");
 MODULE_LICENSE("GPL");
@@ -112,11 +105,8 @@ module_param(mode_option, charp, 0644);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0);
 MODULE_PARM_DESC(mode, "Default video mode e.g. '648x480-8@60' (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 
 /* - */
@@ -710,7 +700,7 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
info->fix.mmio_len = pci_resource_len(dev, 1);
 
/* Map physical IO memory address into kernel space */
-   info->screen_base = pci_iomap(dev, 0, 0);
+   info->screen_base = pci_iomap_wc(dev, 0, 0);
if (! info->screen_base) {
rc = -ENOMEM;
dev_err(info->device, "iomap for framebuffer failed\n");
@@ -781,12 +771,9 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
/* Record a reference to the driver data */
pci_set_drvdata(dev, info);
 
-#ifdef CONFIG_MTRR
-   if (mtrr) {
-   par->mtrr_reg = -1;
-   par->mtrr_reg = mtrr_add(info->fix.smem_start, info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-   }
-#endif
+   if (mtrr)
+   par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+ 

[PATCH v5 3/3] video: fbdev: atyfb: use arch_phys_wc_add() and ioremap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This driver uses strong UC for the MMIO region, and ioremap_wc()
for the framebuffer to whitelist for the WC MTRR what can be changed
to WC. On PAT systems we don't need the MTRR call, so just use
arch_phys_wc_add() there; this lets us remove all those ifdefs.
Let's also be consistent and use ioremap_wc() for ATARI as well.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdefery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
printed when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Toshi Kani 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Ville Syrjälä 
Cc: Rob Clark 
Cc: Mathias Krause 
Cc: Andrzej Hajda 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/aty/atyfb.h  |  4 +---
 drivers/video/fbdev/aty/atyfb_base.c | 36 +++-
 2 files changed, 8 insertions(+), 32 deletions(-)

diff --git a/drivers/video/fbdev/aty/atyfb.h b/drivers/video/fbdev/aty/atyfb.h
index 89ec439..63c4842 100644
--- a/drivers/video/fbdev/aty/atyfb.h
+++ b/drivers/video/fbdev/aty/atyfb.h
@@ -182,9 +182,7 @@ struct atyfb_par {
unsigned long irq_flags;
unsigned int irq;
spinlock_t int_lock;
-#ifdef CONFIG_MTRR
-   int mtrr_aper;
-#endif
+   int wc_cookie;
u32 mem_cntl;
struct crtc saved_crtc;
union aty_pll saved_pll;
diff --git a/drivers/video/fbdev/aty/atyfb_base.c b/drivers/video/fbdev/aty/atyfb_base.c
index 546f5af..96c605c 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -98,9 +98,6 @@
 #ifdef CONFIG_PMAC_BACKLIGHT
 #include 
 #endif
-#ifdef CONFIG_MTRR
-#include 
-#endif
 
 /*
  * Debug flags.
@@ -303,9 +300,7 @@ static struct fb_ops atyfb_ops = {
 };
 
 static bool noaccel;
-#ifdef CONFIG_MTRR
 static bool nomtrr;
-#endif
 static int vram;
 static int pll;
 static int mclk;
@@ -2628,17 +2623,13 @@ static int aty_init(struct fb_info *info)
aty_st_le32(BUS_CNTL, aty_ld_le32(BUS_CNTL, par) |
BUS_APER_REG_DIS, par);
 
-#ifdef CONFIG_MTRR
-   par->mtrr_aper = -1;
-   if (!nomtrr) {
+   if (!nomtrr)
/*
 * Only the ioremap_wc()'d area will get WC here
 * since ioremap_uc() was used on the entire PCI BAR.
 */
-   par->mtrr_aper = mtrr_add(par->res_start, par->res_size,
- MTRR_TYPE_WRCOMB, 1);
-   }
-#endif
+   par->wc_cookie = arch_phys_wc_add(par->res_start,
+ par->res_size);
 
info->fbops = &atyfb_ops;
info->pseudo_palette = par->pseudo_palette;
@@ -2766,13 +2757,8 @@ aty_init_exit:
/* restore video mode */
aty_set_crtc(par, &par->saved_crtc);
par->pll_ops->set_pll(info, &par->saved_pll);
+   arch_phys_wc_del(par->wc_cookie);
 
-#ifdef CONFIG_MTRR
-   if (par->mtrr_aper >= 0) {
-   mtrr_del(par->mtrr_aper, 0, 0);
-   par->mtrr_aper = -1;
-   }
-#endif
return ret;
 }
 
@@ 

[PATCH v8 8/9] video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This driver uses the same area for MTRR as for the ioremap().
Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. In order to
take advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdeffery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Jingoo Han 
Cc: Geert Uytterhoeven 
Cc: Daniel Vetter 
Cc: "Lad, Prabhakar" 
Cc: Rickard Strandqvist 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/s3fb.c | 35 ++-
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c
index f0ae61a..13b1090 100644
--- a/drivers/video/fbdev/s3fb.c
+++ b/drivers/video/fbdev/s3fb.c
@@ -28,13 +28,9 @@
 #include 
 #include 
 
-#ifdef CONFIG_MTRR
-#include 
-#endif
-
 struct s3fb_info {
int chip, rev, mclk_freq;
-   int mtrr_reg;
+   int wc_cookie;
struct vgastate state;
struct mutex open_lock;
unsigned int ref_count;
@@ -154,11 +150,7 @@ static const struct svga_timing_regs s3_timing_regs = {
 
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
-
 static int fasttext = 1;
 
 
@@ -170,11 +162,8 @@ module_param(mode_option, charp, 0444);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0444);
 MODULE_PARM_DESC(mode, "Default video mode ('640x480-8@60', etc) 
(deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, 
default=1)");
-#endif
 
 module_param(fasttext, int, 0644);
 MODULE_PARM_DESC(fasttext, "Enable S3 fast text mode (1=enable, 0=disable, 
default=1)");
@@ -1168,7 +1157,7 @@ static int s3_pci_probe(struct pci_dev *dev, const struct 
pci_device_id *id)
info->fix.smem_len = pci_resource_len(dev, 0);
 
/* Map physical IO memory address into kernel space */
-   info->screen_base = pci_iomap(dev, 0, 0);
+   info->screen_base = pci_iomap_wc(dev, 0, 0);
if (! info->screen_base) {
rc = -ENOMEM;
dev_err(info->device, "iomap for framebuffer failed\n");
@@ -1365,12 +1354,9 @@ static int s3_pci_probe(struct pci_dev *dev, const 
struct pci_device_id *id)
/* Record a reference to the driver data */
pci_set_drvdata(dev, info);
 
-#ifdef CONFIG_MTRR
-   if (mtrr) {
-   par->mtrr_reg = -1;
-   par->mtrr_reg = mtrr_add(info->fix.smem_start, 
info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-   }
-#endif
+   if (mtrr)
+   par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+ info->fix.smem_len);
 
return 0;
 
@@ -1405,14 +1391,7 @@ static void s3_pci_remove(struct pci_dev *dev)
 
  

Re: Stop SSD from waiting for "Spinning up disk..."

2015-06-24 Thread Greg Kroah-Hartman
On Thu, Jun 25, 2015 at 07:55:45AM +0800, Jeff Chua wrote:
> On Thu, Jun 25, 2015 at 12:28 AM, Greg Kroah-Hartman
>  wrote:
> > On Thu, Jun 25, 2015 at 12:22:47AM +0800, Jeff Chua wrote:
> 
> >> Both sda and sdb have the same SSD model.
> >
> > That's a bug in your USB bridge chip, odds are it is not reporting the
> > value properly.  There's nothing the scsi core or USB stack can do about
> > this, sorry.  Please complain to the hardware manufacturer.
> 
> There are workaround boot cmdline parameters for other things ... any
> chance to consider  one to fix broken rotational option? I'm not sure
> how many out there are broken, but I really would like a faster way to
> access my USB SSD without waiting for the "disk spinup".

Just like module parameters, boot command lines are not for device
specific attributes, sorry.  Again, please contact the manufacturer to
get this fixed.  We can't add a quirk for this bridge because it would
not work if you really put a rotational disk behind it.

Given the cheap cost of these types of bridges, I recommend just getting
one that works.

Best of luck,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 2/3] video: fbdev: atyfb: replace MTRR UC hole with strong UC

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Replace a WC MTRR call followed by a UC MTRR "hole" call
with a single WC MTRR call and use strong UC to protect
the MMIO region and account for the device's architecture
and MTRR size requirements.

The atyfb driver relies on two overlapping MTRRs. It
does this to account for the fact that on some devices
it has the MMIO region bundled together with the framebuffer
on the same PCI BAR, and for the hardware requirement that
both the base and the size of an MTRR be powers of two. In
the atyfb driver's worst case the PCI BAR is 16 MiB while
the MMIO region is on the last 4 KiB of the same PCI BAR.
If we use just one MTRR for WC we can only end up with an
8 MiB or 16 MiB framebuffer. Using a 16 MiB WC framebuffer
area is unacceptable since we need the MMIO region to not
be write-combined. An 8 MiB WC framebuffer leaves quite a
bit of framebuffer space unusable and would reduce the
resolution capability of the device considerably. An
alternative is to use many MTRRs, but on some systems that
could mean not having enough MTRRs to cover the framebuffer.
The current driver solution is to issue a 16 MiB WC MTRR
followed by a 4 KiB UC MTRR on the last 4 KiB.

The current ioremap*() strategy is worth documenting as
well: the first ioremap() is used only for the MMIO region,
and a second ioremap() call is used for the framebuffer
*and* the MMIO region, so the MMIO region ends up mapped
twice. Two ioremap() calls are used since in some situations
the framebuffer actually ends up on a separate auxiliary PCI
BAR, but this is not always true: in the worst case the PCI
BAR is shared for both MMIO and the framebuffer. By allowing
overlapping ioremap() calls the driver enables two types of
devices with one simple ioremap() strategy.
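
The MTRR sizing argument above (a 16 MiB BAR with the MMIO
region in its last 4 KiB) can be checked with a small userspace
sketch. This is only an illustration and not driver code: the
helper names largest_pow2_le() and pow2_chunks() are made up
here, and the greedy split is an assumption about how one might
carve the range into power-of-two, naturally aligned pieces.

```c
#include <stdint.h>

/* Hypothetical helper: largest power of two <= len (len must be non-zero). */
static uint64_t largest_pow2_le(uint64_t len)
{
	uint64_t p = 1;

	while ((p << 1) != 0 && (p << 1) <= len)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: count the power-of-two sized, naturally
 * aligned chunks a greedy split needs to cover [base, base + len).
 * Each chunk stands in for one WC MTRR.
 */
static unsigned int pow2_chunks(uint64_t base, uint64_t len)
{
	unsigned int n = 0;

	while (len) {
		uint64_t chunk = largest_pow2_le(len);

		/* Shrink the chunk until base is aligned to its size. */
		while (base & (chunk - 1))
			chunk >>= 1;

		base += chunk;
		len -= chunk;
		n++;
	}
	return n;
}
```

With a 16 MiB aligned base (0xd0000000 is an arbitrary example
address) and a length of 16 MiB minus 4 KiB (0xFFF000), the
largest single region is 8 MiB and the greedy split needs 12
regions, which is why covering the framebuffer exactly with WC
MTRRs alone is impractical.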

For non PAT systems:

As per Intel SDM "11.5.2.1 Selecting Memory Types for Pentium
Pro and Pentium II Processors" [0] the effect of a WC MTRR for
a region with page attribute settings set to PCD=1, PWT=1
(Linux _PAGE_CACHE_MODE_UC) will render the effective memory
type to UC. A WC MTRR for a region with page attribute settings
set to PCD=1, PWT=0 (Linux _PAGE_CACHE_MODE_UC_MINUS) will render
the effective memory type to WC *but* yet this is considered
implementation defined -- that is, "system designers are
encouraged to avoid these implementation-defined combinations".
A WC MTRR for a region with page attribute settings set to
PCD=0, PWT=1 (Linux _PAGE_CACHE_MODE_WC) will render the
effective memory type to WC *but* this is also implementation
defined. Such is the case for non-PAT systems.

For PAT systems:

As per Intel SDM "11.5.2.2 Selecting Memory Types for Pentium
III and More Recent Processor Families" the effect of a WC MTRR
for a region with a PAT entry value of UC will be UC. The effect
of a WC MTRR on a region with a PAT entry UC- will be WC. The
effect of a WC MTRR on a region with PAT entry WC is WC.

This can all be summarized in the following table:

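----------------------------------------------------------------------
MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
----------------------------------------------------------------------
                                                  Non-PAT |  PAT
     PAT
     |PCD
     ||PWT
     |||
WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   UC
WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
----------------------------------------------------------------------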

 (*) denotes implementation defined
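
As a rough illustration, the non-PAT column of the table above
can be written down as a lookup indexed by the PCD and PWT page
attribute bits. This is a sketch only: the enum, struct and
function names are invented for this example and do not exist
in the kernel, and the values are simply transcribed from the
table.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum mem_type { MT_WB, MT_WC, MT_UC_MINUS, MT_UC };

struct wc_mtrr_effect {
	enum mem_type page_type;	/* type requested by the PCD/PWT bits */
	enum mem_type effective;	/* result under a WC MTRR on non-PAT */
	bool impl_defined;		/* marked (*) in the table */
};

/* Indexed by (PCD << 1) | PWT, transcribed from the non-PAT column. */
static const struct wc_mtrr_effect non_pat_wc_effect[4] = {
	{ MT_WB,       MT_WC, false },	/* PCD=0, PWT=0 */
	{ MT_WC,       MT_WC, true  },	/* PCD=0, PWT=1 */
	{ MT_UC_MINUS, MT_WC, true  },	/* PCD=1, PWT=0 */
	{ MT_UC,       MT_UC, false },	/* PCD=1, PWT=1 */
};

static struct wc_mtrr_effect wc_mtrr_lookup(unsigned int pcd, unsigned int pwt)
{
	return non_pat_wc_effect[((pcd & 1) << 1) | (pwt & 1)];
}
```

The only entry that keeps a WC MTRR from taking effect is
PCD=1, PWT=1, which is the property the MMIO protection relies
on.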

By default Linux today uses _PAGE_CACHE_MODE_UC_MINUS for
ioremap_nocache(). On x86 ioremap() aliases
ioremap_nocache(). The preferred value for Linux may soon
change however, the goal is to use _PAGE_CACHE_MODE_UC by
default in the future.

We can use ioremap_uc() to set PCD=1, PWT=1 on non-PAT systems
and use a PAT value of UC for PAT systems. This will ensure the
same settings are in place regardless of what Linux decides to
use by default later and to not regress our MTRR strategy since
the effective memory type will differ depending on the value used.
Using a WC MTRR on such an area will be nullified. This technique
can be used to protect the MMIO region in this driver's case and
address the restrictions of the device's architecture as well as
restrictions set upon us by powers of 2 when using MTRRs.

This allows us to replace the two MTRR calls with a single
16 MiB WC MTRR and use page-attribute settings for non-PAT
and PAT entry values for PAT systems to ensure the
appropriate effective memory type won't have a write-combined
effect on the MMIO region on both non-PAT and PAT systems.
The framebuffer area will be sure to get the write-combined
effective memory type by white-listing it with ioremap_wc().

We ensure the desired effective memory types are set by:

0) Using one 

[PATCH v8 7/9] video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. To take
advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away; MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch. It additionally required manual intervention to
address all the #ifdeffery and to remove redundant things which
arch_phys_wc_add() already addresses, such as the verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Laurent Pinchart 
Cc: Geert Uytterhoeven 
Cc: "Lad, Prabhakar" 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/arkfb.c | 36 +---
 1 file changed, 5 insertions(+), 31 deletions(-)

diff --git a/drivers/video/fbdev/arkfb.c b/drivers/video/fbdev/arkfb.c
index b305a1e..6a317de 100644
--- a/drivers/video/fbdev/arkfb.c
+++ b/drivers/video/fbdev/arkfb.c
@@ -26,13 +26,9 @@
 #include  /* Why should fb driver call console functions? 
because console_lock() */
 #include 
 
-#ifdef CONFIG_MTRR
-#include 
-#endif
-
 struct arkfb_info {
int mclk_freq;
-   int mtrr_reg;
+   int wc_cookie;
 
struct dac_info *dac;
struct vgastate state;
@@ -102,10 +98,6 @@ static const struct svga_timing_regs ark_timing_regs = {
 
 static char *mode_option = "640x480-8@60";
 
-#ifdef CONFIG_MTRR
-static int mtrr = 1;
-#endif
-
 MODULE_AUTHOR("(c) 2007 Ondrej Zajicek ");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("fbdev driver for ARK 2000PV");
@@ -115,11 +107,6 @@ MODULE_PARM_DESC(mode_option, "Default video mode 
('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0444);
 MODULE_PARM_DESC(mode, "Default video mode ('640x480-8@60', etc) 
(deprecated)");
 
-#ifdef CONFIG_MTRR
-module_param(mtrr, int, 0444);
-MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, 
default=1)");
-#endif
-
 static int threshold = 4;
 
 module_param(threshold, int, 0644);
@@ -1002,7 +989,7 @@ static int ark_pci_probe(struct pci_dev *dev, const struct 
pci_device_id *id)
info->fix.smem_len = pci_resource_len(dev, 0);
 
/* Map physical IO memory address into kernel space */
-   info->screen_base = pci_iomap(dev, 0, 0);
+   info->screen_base = pci_iomap_wc(dev, 0, 0);
if (! info->screen_base) {
rc = -ENOMEM;
dev_err(info->device, "iomap for framebuffer failed\n");
@@ -1057,14 +1044,8 @@ static int ark_pci_probe(struct pci_dev *dev, const 
struct pci_device_id *id)
 
/* Record a reference to the driver data */
pci_set_drvdata(dev, info);
-
-#ifdef CONFIG_MTRR
-   if (mtrr) {
-   par->mtrr_reg = -1;
-   par->mtrr_reg = mtrr_add(info->fix.smem_start, 
info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-   }
-#endif
-
+   par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+ info->fix.smem_len);
return 0;
 
/* Error handling */
@@ -1092,14 +1073,7 @@ static void ark_pci_remove(struct pci_dev *dev)
 
if (info) {
struct 

[PATCH v5 1/3] video: fbdev: atyfb: clarify ioremap() base and length used

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This has no functional changes, it just adjusts
the ioremap() call for the framebuffer to use
the same values we later use for the framebuffer,
this will make it easier to review the next change.

The size of the framebuffer varies but since this is
for PCI we *know* this defaults to 0x80.
atyfb_setup_generic() is *only* used on PCI probe.

Cc: Toshi Kani 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Ville Syrjälä 
Cc: Rob Clark 
Cc: Mathias Krause 
Cc: Andrzej Hajda 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/aty/atyfb_base.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/aty/atyfb_base.c 
b/drivers/video/fbdev/aty/atyfb_base.c
index 16936bb..8025624 100644
--- a/drivers/video/fbdev/aty/atyfb_base.c
+++ b/drivers/video/fbdev/aty/atyfb_base.c
@@ -3489,7 +3489,9 @@ static int atyfb_setup_generic(struct pci_dev *pdev, 
struct fb_info *info,
 
/* Map in frame buffer */
info->fix.smem_start = addr;
-   info->screen_base = ioremap(addr, 0x80);
+   info->fix.smem_len = 0x80;
+
+   info->screen_base = ioremap(info->fix.smem_start, info->fix.smem_len);
if (info->screen_base == NULL) {
ret = -ENOMEM;
goto atyfb_setup_generic_fail;
-- 
2.3.2.209.gd67f9d5.dirty



[PATCH v8 6/9] lib: devres: add pcim_iomap_wc() variants

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Now that we have pci_iomap_wc() add the respective
devres helpers. These go unexported for now but
note that should they later be exported this
must go with EXPORT_SYMBOL_GPL().

Cc: Toshi Kani 
Cc: Andy Lutomirski 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Bjorn Helgaas 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Dave Hansen 
Cc: Arnd Bergmann 
Cc: Michael S. Tsirkin 
Cc: venkatesh.pallip...@intel.com
Cc: Stefan Bader 
Cc: Ville Syrjälä 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: konrad.w...@oracle.com
Cc: ville.syrj...@linux.intel.com
Cc: david.vra...@citrix.com
Cc: jbeul...@suse.com
Cc: Roger Pau Monné 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-de...@lists.xensource.com
Acked-by: Arnd Bergmann 
Signed-off-by: Luis R. Rodriguez 
---
 include/linux/pci.h |  2 ++
 lib/devres.c| 76 +
 2 files changed, 78 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1193975..5ff15c1 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1609,9 +1609,11 @@ static inline void pci_dev_specific_enable_acs(struct 
pci_dev *dev) { }
 #endif
 
 void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen);
+void __iomem *pcim_iomap_wc(struct pci_dev *pdev, int bar, unsigned long 
maxlen);
 void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr);
 void __iomem * const *pcim_iomap_table(struct pci_dev *pdev);
 int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name);
+int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name);
 int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
   const char *name);
 void pcim_iounmap_regions(struct pci_dev *pdev, int mask);
diff --git a/lib/devres.c b/lib/devres.c
index fbe2aac..38acc53 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -304,6 +304,29 @@ void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, 
unsigned long maxlen)
 EXPORT_SYMBOL(pcim_iomap);
 
 /**
+ * pcim_iomap_wc - Managed pci_iomap_wc()
+ * @pdev: PCI device to iomap for
+ * @bar: BAR to iomap
+ * @maxlen: Maximum length of iomap
+ *
+ * Managed pci_iomap_wc().  Map is automatically unmapped on driver
+ * detach.
+ */
+void __iomem *pcim_iomap_wc(struct pci_dev *pdev, int bar, unsigned long 
maxlen)
+{
+   void __iomem **tbl;
+
+   BUG_ON(bar >= PCIM_IOMAP_MAX);
+
+   tbl = (void __iomem **)pcim_iomap_table(pdev);
+   if (!tbl || tbl[bar])   /* duplicate mappings not allowed */
+   return NULL;
+
+   tbl[bar] = pci_iomap_wc(pdev, bar, maxlen);
+   return tbl[bar];
+}
+
+/**
  * pcim_iounmap - Managed pci_iounmap()
  * @pdev: PCI device to iounmap for
  * @addr: Address to unmap
@@ -383,6 +406,59 @@ int pcim_iomap_regions(struct pci_dev *pdev, int mask, 
const char *name)
 EXPORT_SYMBOL(pcim_iomap_regions);
 
 /**
+ * pcim_iomap_wc_regions - Request and iomap PCI BARs with write-combining
+ * @pdev: PCI device to map IO resources for
+ * @mask: Mask of BARs to request and iomap
+ * @name: Name used when requesting regions
+ *
+ * Request and iomap regions specified by @mask with a preference for
+ * write-combining.
+ */
+int pcim_iomap_wc_regions(struct pci_dev *pdev, int mask, const char *name)
+{
+   void __iomem * const *iomap;
+   int i, rc;
+
+   iomap = pcim_iomap_table(pdev);
+   if (!iomap)
+   return -ENOMEM;
+
+   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+   unsigned long len;
+
+   if (!(mask & (1 << i)))
+   continue;
+
+   rc = -EINVAL;
+   len = pci_resource_len(pdev, i);
+   if (!len)
+   goto err_inval;
+
+   rc = pci_request_region(pdev, i, name);
+   if (rc)
+   goto err_inval;
+
+   rc = -ENOMEM;
+   if (!pcim_iomap_wc(pdev, i, 0))
+   goto err_region;
+   }
+
+   return 0;
+
+ err_region:
+   pci_release_region(pdev, i);
+ err_inval:
+   while (--i >= 0) {
+   if (!(mask & (1 << i)))
+   continue;
+   pcim_iounmap(pdev, iomap[i]);
+   pci_release_region(pdev, i);
+   }
+
+   return rc;
+}
+
+/**
  * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
  * @pdev: PCI device to map IO resources for
  * @mask: Mask of BARs to iomap
-- 
2.3.2.209.gd67f9d5.dirty
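
The mask-driven loop and error unwinding used by
pcim_iomap_wc_regions() in the patch above can be sketched in
userspace as follows. This is only an illustration of the
control flow: fake_request(), fake_release(), fake_map(), the
requested[] array and the fail_at knob are hypothetical
stand-ins, not kernel interfaces.

```c
#include <errno.h>

#define NR_BARS 6

static int requested[NR_BARS];	/* hypothetical stand-in for region state */
static int fail_at = -1;	/* hypothetical knob: BAR whose mapping fails */

static int fake_request(int i) { requested[i] = 1; return 0; }
static void fake_release(int i) { requested[i] = 0; }
static int fake_map(int i) { return i == fail_at ? -1 : 0; }

/* Request and map every BAR selected in mask; undo everything on failure. */
static int map_regions(int mask)
{
	int i, rc;

	for (i = 0; i < NR_BARS; i++) {
		if (!(mask & (1 << i)))
			continue;

		rc = fake_request(i);
		if (rc)
			goto err_unwind;

		rc = -ENOMEM;
		if (fake_map(i))
			goto err_region;
	}

	return 0;

err_region:
	/* The failing BAR was requested but never mapped. */
	fake_release(i);
err_unwind:
	/* Walk back over the BARs that were fully set up earlier. */
	while (--i >= 0) {
		if (!(mask & (1 << i)))
			continue;
		fake_release(i);
	}

	return rc;
}
```

The two labels mirror the patch: the failing BAR only needs its
region released, while every earlier BAR in the mask is fully
torn down by the backwards walk.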



[PATCH v5 0/3] atyfb: address MTRR corner case

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Andrew,

Forgive me for the TL;DR, I'm afraid I need to be crystal clear on this
patchset as it's the most complex in the entire series. The skinny is that this
patchset addresses a complex workaround with APIs now merged upstream going in
for v4.2. The driver maintainer hasn't followed up with the driver changes for
over a month and no one else has provided Acks for these device driver
changes [0]. We have a few options:

0) Sit and wait for a driver maintainer to review this
1) Merge this as-is and hope for reports
2) Go with the nopat requirement as with the ivtv and ipath drivers

I'd prefer to merge this as is, and only if reports come back with
issues should we then consider 2) as we'd then have at least a well
documented work effort required for this transformation. This device
driver is also old, so I don't expect many reports anyway.



The TL;DR:

As part of the long haul effort to rid the world of direct MTRR use [1] we've
had to also work on alternative solutions which can co-exist with PAT
interfaces. Most of the transformation of device drivers to use PAT was fairly
easy (TM): so long as ioremap_wc() was used we could then convert
drivers using mtrr_add() over to the arch-agnostic and PAT-aware (ignored when
PAT is enabled) arch_phys_wc_add(). This was typically easy to do, for instance
in cases where a full PCI BAR was used for MMIO registers and another PCI BAR
was used with write-combining effects desirable. In some cases we just needed
new WC APIs for some buses. This was the case for most modern devices, but a
few old devices had a combined set of MMIO registers and the write-combined
area mixed. In such situations even when using MTRR one had to figure out
creative solutions to make things work, especially considering MTRRs were
limited and they had size constraints: an MTRR base and size must be
a power of two.

The good news is that on Linux there were only three device drivers in total
that we ended up having radical issues with when converting them over to PAT
interfaces. One was the ivtv media device driver, another was the
infiniband ipath device driver. The other one was the framebuffer atyfb
device driver that this series addresses. For both ivtv and ipath we've
decided to simply require users of those devices to boot with the nopat
kernel parameter because both device drivers are ancient and the work
required to fully convert to PAT interfaces is significant (in the ipath case)
or nearly impossible (ivtv). For details please refer to the respective
and now upstream commits:

7ea402d x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
1bf1735 x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled

To demo exactly how much effort would have been required I decided to venture
into atyfb and try to fix that device driver first, considering it had the
worst case situation to address as it used size hackery and MTRR combinations
of different types. In order to accomplish this we needed to map out all
possible combinatorial effects of PAT page entries with write-combining, and
page attributes (PAT, PCD, PWT) with write-combining effects for non-PAT
systems. We did this not only for atyfb's sake but also for any other possible
future driver which might meet these same needs. We needed to take this a bit
more seriously given that our long term goal was also to change the default
behaviour of ioremap_nocache() to use strong UC instead of UC-, and this had
to be taken into consideration when converting drivers over. The documentation
table for all these possible combinatorial entries is now upstream:

2f9e897 x86/mm/mtrr, pat: Document Write Combining MTRR type effects on PAT / non-PAT pages

Of importance to this patch set is this table:

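----------------------------------------------------------------------
MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
----------------------------------------------------------------------
                                                  Non-PAT |  PAT
     PAT
     |PCD
     ||PWT
     |||
WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   UC
WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
----------------------------------------------------------------------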

The atyfb driver used to use two MTRR calls, a large WC MTRR followed by a
UC MTRR "hole" call for the MMIO registers. This was done this way on atyfb
because the offset and size of the framebuffer area would only work well
this way, otherwise you'd also have to try a series of small MTRR calls and you
might end up running out of MTRRs. For non-PAT systems we take advantage of the
above map to protect an MMIO region with 011 page attributes (this maps to
strong UC for PAT systems) so that if a 

[PATCH v8 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

PCI BARs tell us whether prefetching is safe, but they don't say anything
about write combining (WC).  WC changes ordering rules and allows writes to
be collapsed, so it's not safe in general to use it on a prefetchable
region.

Add pci_iomap_wc() and pci_iomap_wc_range() so drivers can take advantage
of write combining when they know it's safe.

On architectures that don't fully support WC, e.g., x86 without PAT,
drivers for legacy framebuffers may get some of the benefit by using
arch_phys_wc_add() in addition to pci_iomap_wc().  But arch_phys_wc_add()
is unreliable and should be avoided in general.  On x86, it uses MTRRs,
which are limited in number and size, so the results will vary based on
driver loading order.

The goals of adding pci_iomap_wc() are to:

- Give drivers an architecture-independent way to use WC so they can stop
  using interfaces like mtrr_add() (on x86, pci_iomap_wc() uses
  PAT when available)

- Move toward using _PAGE_CACHE_MODE_UC, not _PAGE_CACHE_MODE_UC_MINUS,
  on x86 on ioremap_nocache() (see de33c442ed2a ("x86 PAT: fix
  performance drop for glx, use UC minus for ioremap(), ioremap_nocache()
  and pci_mmap_page_range()")

Link: http://lkml.kernel.org/r/1426893517-2511-6-git-send-email-mcg...@do-not-panic.com
Original-posting: http://lkml.kernel.org/r/1432163293-20965-1-git-send-email-mcg...@do-not-panic.com
Cc: Toshi Kani 
Cc: Andy Lutomirski 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Bjorn Helgaas 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Dave Hansen 
Cc: Arnd Bergmann 
Cc: Michael S. Tsirkin 
Cc: venkatesh.pallip...@intel.com
Cc: Stefan Bader 
Cc: Ville Syrjälä 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: konrad.w...@oracle.com
Cc: ville.syrj...@linux.intel.com
Cc: david.vra...@citrix.com
Cc: jbeul...@suse.com
Cc: Roger Pau Monné 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-de...@lists.xensource.com
Acked-by: Arnd Bergmann 
Signed-off-by: Luis R. Rodriguez 
---
 include/asm-generic/pci_iomap.h | 14 ++
 lib/pci_iomap.c | 61 +
 2 files changed, 75 insertions(+)

diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h
index 7389c87..b1e17fc 100644
--- a/include/asm-generic/pci_iomap.h
+++ b/include/asm-generic/pci_iomap.h
@@ -15,9 +15,13 @@ struct pci_dev;
 #ifdef CONFIG_PCI
 /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
 extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long 
max);
+extern void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long 
max);
 extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
 unsigned long offset,
 unsigned long maxlen);
+extern void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
+   unsigned long offset,
+   unsigned long maxlen);
 /* Create a virtual mapping cookie for a port on a given PCI device.
  * Do not call this directly, it exists to make it easier for architectures
  * to override */
@@ -34,12 +38,22 @@ static inline void __iomem *pci_iomap(struct pci_dev *dev, 
int bar, unsigned lon
return NULL;
 }
 
+static inline void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, 
unsigned long max)
+{
+   return NULL;
+}
 static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
unsigned long offset,
unsigned long maxlen)
 {
return NULL;
 }
+static inline void __iomem *pci_iomap_wc_range(struct pci_dev *dev, int bar,
+  unsigned long offset,
+  unsigned long maxlen)
+{
+   return NULL;
+}
 #endif
 
 #endif /* __ASM_GENERIC_IO_H */
diff --git a/lib/pci_iomap.c b/lib/pci_iomap.c
index bcce5f1..9604dcb 100644
--- a/lib/pci_iomap.c
+++ b/lib/pci_iomap.c
@@ -52,6 +52,46 @@ void __iomem *pci_iomap_range(struct pci_dev *dev,
 EXPORT_SYMBOL(pci_iomap_range);
 
 /**
+ * pci_iomap_wc_range - create a virtual WC mapping cookie for a PCI BAR
+ * @dev: PCI device that owns the BAR
+ * @bar: BAR number
+ * @offset: map memory at the given offset in BAR
+ * @maxlen: max length of the memory to map
+ *
+ * Using this function you will get a __iomem address to your device BAR.
+ * You can access it using ioread*() and iowrite*(). These functions hide
+ * the details if this is a MMIO or PIO address space and will just do what
+ * you expect from them in the correct way. When possible write combining
+ * is used.
+ *
+ * @maxlen specifies the maximum length to map. If you want to get access to
+ * the complete BAR from offset to 

Re: [PATCH v4 4/9] staging:lustre: merge socklnd_lib-linux.h into socklnd.h

2015-06-24 Thread Guenter Roeck
On Wed, Jun 24, 2015 at 02:37:51PM +0200, Geert Uytterhoeven wrote:
> Hi James,
> 
> On Thu, Jun 11, 2015 at 9:18 PM, James Simmons  wrote:
> > From: John L. Hammond 
> >
> > Originally socklnd_lib-linux.h contained linux specific
> > wrappers and defines but since the linux kernel is the
> > only supported platform now we can merge what little
> > remains in the header into socklnd.h. This is broken
> > out of the original patch 12932 that was merged to the
> > Intel/OpenSFS branch.
> >
> > Signed-off-by: John L. Hammond 
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675
> > Reviewed-on: http://review.whamcloud.com/12932
> > Reviewed-by: Isaac Huang 
> > Reviewed-by: James Simmons 
> > Reviewed-by: Oleg Drokin 
> > Signed-off-by: James Simmons 
> > ---
> >  .../staging/lustre/lnet/klnds/socklnd/socklnd.h|   39 +-
> >  .../lustre/lnet/klnds/socklnd/socklnd_lib-linux.h  |   86 
> > 
> >  .../lustre/lnet/klnds/socklnd/socklnd_lib.c|4 +-
> >  3 files changed, 40 insertions(+), 89 deletions(-)
> >  delete mode 100644 
> > drivers/staging/lustre/lnet/klnds/socklnd/socklnd_lib-linux.h
> >
> > diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h 
> > b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
> > index 53275f9..7125eb9 100644
> > --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
> > +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
> > @@ -25,16 +25,40 @@
> >   *
> >   */
> >
> > +#ifndef _SOCKLND_SOCKLND_H_
> > +#define _SOCKLND_SOCKLND_H_
> > +
> >  #define DEBUG_PORTAL_ALLOC
> >  #define DEBUG_SUBSYSTEM S_LND
> >
> > -#include "socklnd_lib-linux.h"
> > +#include 
> > +#include 
> > +#include 
> 
> Including  first causes a build failure for m68k/allmodconfig:
> 
> arch/m68k/include/asm/irq.h:77:12: error: expected '=', ',', ';',
> 'asm' or '__attribute__' before 'void'
> arch/m68k/include/asm/irq.h:78:1: error: unknown type name 'atomic_t'
> arch/m68k/include/asm/irq.h:77:12: error: expected '=', ',', ';',
> 'asm' or '__attribute__' before 'void'
> arch/m68k/include/asm/irq.h:78:1: error: unknown type name 'atomic_t'
> 
> http://kisskb.ellerman.id.au/kisskb/buildresult/12448922/
> 
> Fixing it inside arch/m68k/include/asm/irq.h might cause Include Hell,
> so perhaps you can just move the  include below all 
> includes?
> 
Hi Geert,

I have not tested it, but I think the following may fix the problem
while avoiding any include problems. Since pt_regs is used in the file,
one could argue that it should be declared. 
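
For readers unfamiliar with the idiom: a forward declaration only tells the
compiler that a struct type exists, which is enough to declare pointers to it
in a prototype. Below is a minimal userspace sketch of that idea. All demo_*
names are made up for illustration and do not come from the m68k headers, and
the sketch makes no claim about whether the untested change resolves the
errors quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: the type exists, its layout is not known yet. */
struct demo_regs;

/* A prototype only needs a pointer to the incomplete type. */
int demo_do_irq(int irq, struct demo_regs *regs);

/* Full definition; in a real tree this would live in another header. */
struct demo_regs {
	unsigned long pc;
	unsigned long sr;
};

/* Return the irq number if a register frame was supplied, else -1. */
int demo_do_irq(int irq, struct demo_regs *regs)
{
	assert(irq >= 0);
	return regs != NULL ? irq : -1;
}
```

The point is only that the prototype is accepted before the struct layout is
visible; the definition can come later or from a different header.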

Thanks,
Guenter

--
diff --git a/arch/m68k/include/asm/irq.h b/arch/m68k/include/asm/irq.h
index 81ca118d58af..28ffa8d59cf0 100644
--- a/arch/m68k/include/asm/irq.h
+++ b/arch/m68k/include/asm/irq.h
@@ -74,6 +74,8 @@ extern unsigned int irq_canonicalize(unsigned int irq);
 #define irq_canonicalize(irq)  (irq)
 #endif /* !(CONFIG_M68020 || CONFIG_M68030 || CONFIG_M68040 || CONFIG_M68060) 
*/
 
+struct pt_regs;
+
 asmlinkage void do_IRQ(int irq, struct pt_regs *regs);
 extern atomic_t irq_err_count;
 


[PATCH v8 4/9] video: fbdev: gxt4500: use pci_ioremap_wc_bar() for framebuffer

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

The driver doesn't use mtrr_add() or arch_phys_wc_add() but
since we know the framebuffer is isolated already on an
ioremap() we can take advantage of write combining for
performance where possible.

In this case there are a few motivations for this:

a) Take advantage of PAT when available

b) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

Cc: Laurent Pinchart 
Cc: Rob Clark 
Cc: Geert Uytterhoeven 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/gxt4500.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/gxt4500.c b/drivers/video/fbdev/gxt4500.c
index 135d78a..f19133a 100644
--- a/drivers/video/fbdev/gxt4500.c
+++ b/drivers/video/fbdev/gxt4500.c
@@ -662,7 +662,7 @@ static int gxt4500_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
 
info->fix.smem_start = fb_phys;
info->fix.smem_len = pci_resource_len(pdev, 1);
-   info->screen_base = pci_ioremap_bar(pdev, 1);
+   info->screen_base = pci_ioremap_wc_bar(pdev, 1);
if (!info->screen_base) {
dev_err(&pdev->dev, "gxt4500: cannot map framebuffer\n");
goto err_unmap_regs;
-- 
2.3.2.209.gd67f9d5.dirty



Re: [RFC] virtio_net: Adding tx_timeout function.

2015-06-24 Thread Julio Faracco
2015-06-24 3:10 GMT-03:00 Michael S. Tsirkin :
>
> On Tue, Jun 23, 2015 at 10:44:29PM -0300, Julio Faracco wrote:
> > virtio_net paravirtualized driver does not have a tx_timeout() function to
> > guarantee that the driver will recover properly after receiving a timeout
> > during a transmission of a packet. This patch adds this feature and raises a
> > timeout exception after 5 * HZ jiffies (5 seconds). Based on some tests, this
> > is the best time to use here.
> >
> > Signed-off-by: Julio Faracco 
> > Cc: Jason Wang 
>
> Looks like a bunch of locks and flushes are missing in this patch.  IMHO
> that's just too painful with current hardware.  IMO the right thing to
> do here is to add ability to reset specific queues to hardware.
>

I agree, Michael. Resetting the whole device is the default model for
handling a transmission timeout.
For better performance, only the affected queues should be reset.

> > ---
> >  drivers/net/virtio_net.c |   69 
> > +-
> >  1 file changed, 68 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 63c7810..75ac45c 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -135,6 +135,9 @@ struct virtnet_info {
> >   /* Work struct for config space updates */
> >   struct work_struct config_work;
> >
> > + /* Work struct for resetting the virtio-net driver. */
> > + struct work_struct reset_task;
> > +
> >   /* Does the affinity hint is set for virtqueues? */
> >   bool affinity_hint_set;
> >
> > @@ -1394,6 +1397,18 @@ static int virtnet_change_mtu(struct net_device 
> > *dev, int new_mtu)
> >   return 0;
> >  }
> >
> > +static void virtnet_tx_timeout(struct net_device *dev)
> > +{
> > + struct virtnet_info *vi = netdev_priv(dev);
> > +
> > + dev_warn(>dev, "TX Timeout exception with latency: %ld\n",
> > +  jiffies - dev_trans_start(dev));
> > +
> > + schedule_work(&vi->reset_task);
>
> What if after this triggers user does something
> to the device (e.g. attempts to remove it)?
> Or if a packet is transmitted or used?

Yes, you are right. At some point this work must be canceled,
especially when the driver is being removed.
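
To make the ordering concern concrete, here is a toy, single-threaded
userspace model of "queue the reset on timeout, cancel it on remove". The
demo_* names are hypothetical and this is not the virtio_net code; it ignores
locking and concurrency entirely. In the kernel the cancel step would
presumably be something like cancel_work_sync() in the remove path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel work item (hypothetical names). */
struct demo_work {
	bool pending;
	void (*fn)(struct demo_work *work);
};

struct demo_dev {
	int resets;		/* how many times the reset handler ran */
	bool removed;		/* set once teardown has started */
	struct demo_work reset_task;
};

/* Reset handler: recover the owning device from the embedded work item. */
static void demo_reset_task(struct demo_work *work)
{
	struct demo_dev *dev = (struct demo_dev *)
		((char *)work - offsetof(struct demo_dev, reset_task));

	assert(!dev->removed);	/* must never run against a removed device */
	dev->resets++;
}

void demo_init(struct demo_dev *dev)
{
	dev->resets = 0;
	dev->removed = false;
	dev->reset_task.pending = false;
	dev->reset_task.fn = demo_reset_task;
}

/* Timeout path: only queue the reset, do not run it here. */
void demo_tx_timeout(struct demo_dev *dev)
{
	dev->reset_task.pending = true;
}

/* Stand-in for the worker thread picking up queued work. */
void demo_run_pending(struct demo_dev *dev)
{
	if (!dev->reset_task.pending)
		return;
	dev->reset_task.pending = false;
	dev->reset_task.fn(&dev->reset_task);
}

/* Remove path: cancel queued work first, then mark the device gone. */
void demo_remove(struct demo_dev *dev)
{
	dev->reset_task.pending = false;
	dev->removed = true;
}
```

If demo_remove() skipped the cancel step, a later demo_run_pending() would
run the handler against a removed device, which is the case raised above.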
>
> > +}
> > +
> > +static void virtnet_reset_task(struct work_struct *work);
> > +
> >  static const struct net_device_ops virtnet_netdev = {
> >   .ndo_open= virtnet_open,
> >   .ndo_stop= virtnet_close,
> > @@ -1405,6 +1420,7 @@ static const struct net_device_ops virtnet_netdev = {
> >   .ndo_get_stats64 = virtnet_stats,
> >   .ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
> >   .ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
> > + .ndo_tx_timeout  = virtnet_tx_timeout,
> >  #ifdef CONFIG_NET_POLL_CONTROLLER
> >   .ndo_poll_controller = virtnet_netpoll,
> >  #endif
> > @@ -1750,6 +1766,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> >   dev->netdev_ops = &virtnet_netdev;
> >   dev->features = NETIF_F_HIGHDMA;
> >
> > + dev->watchdog_timeo = 5 * HZ;
> >   dev->ethtool_ops = &virtnet_ethtool_ops;
> >   SET_NETDEV_DEV(dev, &vdev->dev);
> >
> > @@ -1811,6 +1828,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> >   }
> >
> >   INIT_WORK(&vi->config_work, virtnet_config_changed_work);
> > + INIT_WORK(&vi->reset_task, virtnet_reset_task);
> >
> >   /* If we can receive ANY GSO packets, we must allocate large ones. */
> >   if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> > @@ -1891,7 +1909,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> >   netif_carrier_on(dev);
> >   }
> >
> > - pr_debug("virtnet: registered device %s with %d RX and TX vq's\n",
> > + pr_debug("virtio_net: registered device %s with %d RX and TX vq's\n",
> >dev->name, max_queue_pairs);
> >
> >   return 0;
> > @@ -2001,6 +2019,55 @@ static int virtnet_restore(struct virtio_device 
> > *vdev)
> >  }
> >  #endif
> >
> > +static void virtnet_reset_task(struct work_struct *work)
> > +{
> > + struct virtnet_info *vi =
> > + container_of(work, struct virtnet_info, reset_task);
> > + struct net_device *dev = vi->dev;
> > + struct virtio_device *vdev = vi->vdev;
> > + int err, i;
> > +
> > + flush_work(&vi->config_work);
> > +
> > + netif_device_detach(vi->dev);
> > + cancel_delayed_work_sync(&vi->refill);
> > +
> > + if (netif_running(vi->dev)) {
> > + for (i = 0; i < vi->max_queue_pairs; i++) {
> > + napi_disable(&vi->rq[i].napi);
> > + napi_hash_del(&vi->rq[i].napi);
> > + netif_napi_del(&vi->rq[i].napi);
> > + }
> > + }
> > +
> > + remove_vq_common(vi);
> > +
> > + dev->stats.tx_errors++;
> > +
> > + err = init_vqs(vi);
> > + if (err) {
> > + dev_warn(>dev, "virtio_net: virtqueue initialization 
> > failed.\n");
> > + return;
> > + }
> > +
> 

[PATCH v8 3/9] video: fbdev: kyrofb: use arch_phys_wc_add() and pci_ioremap_wc_bar()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. To take
advantage of that, also ensure the ioremap'd area is requested
as write-combining.
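
As a rough illustration of why the #ifdef CONFIG_MTRR blocks can go away,
here is a userspace model of the cookie pattern. It assumes the add helper
returns 0 when no range register is needed (write-combining handled by PAT)
and a positive cookie otherwise, and that the delete helper ignores anything
below 1. That is my understanding of arch_phys_wc_add()/arch_phys_wc_del(),
so treat it as an assumption; every demo_* name and the offset value are
invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_COOKIE_OFFSET 1000	/* hypothetical, not the kernel's value */

static bool demo_pat_enabled;	/* pretend platform state */
static int demo_live_entries;	/* range registers currently in use */

void demo_set_pat(bool enabled)
{
	demo_pat_enabled = enabled;
}

int demo_wc_live(void)
{
	return demo_live_entries;
}

/* Returns 0 when no range register is needed, else a positive cookie. */
int demo_wc_add(unsigned long base, unsigned long size)
{
	(void)base;
	(void)size;
	if (demo_pat_enabled)
		return 0;
	return DEMO_COOKIE_OFFSET + demo_live_entries++;
}

/* Safe to call with any cookie: values below 1 are silently ignored. */
void demo_wc_del(int cookie)
{
	if (cookie < 1)
		return;
	assert(cookie >= DEMO_COOKIE_OFFSET);
	demo_live_entries--;
}
```

With this shape a driver can store the cookie and call the delete helper
unconditionally on teardown, with no configuration-dependent branches.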

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch, it additionally required manual intervention to
address all the #ifdery and removal of redundant things which
arch_phys_wc_add() already addresses such as verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jingoo Han 
Cc: Geert Uytterhoeven 
Cc: Laurent Pinchart 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/kyro/fbdev.c | 33 +++--
 include/video/kyro.h |  4 +---
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/video/fbdev/kyro/fbdev.c b/drivers/video/fbdev/kyro/fbdev.c
index 65041e1..5bb0153 100644
--- a/drivers/video/fbdev/kyro/fbdev.c
+++ b/drivers/video/fbdev/kyro/fbdev.c
@@ -22,9 +22,6 @@
 #include 
 #include 
 #include 
-#ifdef CONFIG_MTRR
-#include 
-#endif
 
 #include 
 
@@ -84,9 +81,7 @@ static device_info_t deviceInfo;
 static char *mode_option = NULL;
 static int nopan = 0;
 static int nowrap = 1;
-#ifdef CONFIG_MTRR
 static int nomtrr = 0;
-#endif
 
 /* PCI driver prototypes */
 static int kyrofb_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
@@ -570,10 +565,8 @@ static int __init kyrofb_setup(char *options)
nopan = 1;
} else if (strcmp(this_opt, "nowrap") == 0) {
nowrap = 1;
-#ifdef CONFIG_MTRR
} else if (strcmp(this_opt, "nomtrr") == 0) {
nomtrr = 1;
-#endif
} else {
mode_option = this_opt;
}
@@ -691,17 +684,16 @@ static int kyrofb_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 
currentpar->regbase = deviceInfo.pSTGReg =
ioremap_nocache(kyro_fix.mmio_start, kyro_fix.mmio_len);
+   if (!currentpar->regbase)
+   goto out_free_fb;
 
-   info->screen_base = ioremap_nocache(kyro_fix.smem_start,
-   kyro_fix.smem_len);
+   info->screen_base = pci_ioremap_wc_bar(pdev, 0);
+   if (!info->screen_base)
+   goto out_unmap_regs;
 
-#ifdef CONFIG_MTRR
if (!nomtrr)
-   currentpar->mtrr_handle =
-   mtrr_add(kyro_fix.smem_start,
-kyro_fix.smem_len,
-MTRR_TYPE_WRCOMB, 1);
-#endif
+   currentpar->wc_cookie = arch_phys_wc_add(kyro_fix.smem_start,
+kyro_fix.smem_len);
 
kyro_fix.ypanstep   = nopan ? 0 : 1;
kyro_fix.ywrapstep  = nowrap ? 0 : 1;
@@ -745,8 +737,10 @@ static int kyrofb_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
return 0;
 
 out_unmap:
-   iounmap(currentpar->regbase);
iounmap(info->screen_base);
+out_unmap_regs:
+   iounmap(currentpar->regbase);
+out_free_fb:
  

[PATCH v8 2/9] video: fbdev: i740fb: use arch_phys_wc_add() and pci_ioremap_wc_bar()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Convert the driver from using the x86 specific MTRR code to
the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add()
will avoid MTRR if write-combining is available. To take
advantage of that, also ensure the ioremap'd area is requested
as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away, MTRR is architecture specific and on
   x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over
   _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit
   de33c442e titled "x86 PAT: fix performance drop for glx,
   use UC minus for ioremap(), ioremap_nocache() and
   pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle
SmPL patch, it additionally required manual intervention to
address all the #ifdery and removal of redundant things which
arch_phys_wc_add() already addresses such as verbose message
about when MTRR fails and doing nothing when we didn't get
an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jingoo Han 
Cc: Bjorn Helgaas 
Cc: Geert Uytterhoeven 
Cc: Rob Clark 
Cc: Benoit Taine 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Tomi Valkeinen 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/video/fbdev/i740fb.c | 35 ++-
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/i740fb.c b/drivers/video/fbdev/i740fb.c
index a2b4204..452e116 100644
--- a/drivers/video/fbdev/i740fb.c
+++ b/drivers/video/fbdev/i740fb.c
@@ -27,24 +27,15 @@
 #include 
 #include 
 
-#ifdef CONFIG_MTRR
-#include 
-#endif
-
 #include "i740_reg.h"
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 struct i740fb_par {
unsigned char __iomem *regs;
bool has_sgram;
-#ifdef CONFIG_MTRR
-   int mtrr_reg;
-#endif
+   int wc_cookie;
bool ddc_registered;
struct i2c_adapter ddc_adapter;
struct i2c_algo_bit_data ddc_algo;
@@ -1040,7 +1031,7 @@ static int i740fb_probe(struct pci_dev *dev, const struct 
pci_device_id *ent)
goto err_request_regions;
}
 
-   info->screen_base = pci_ioremap_bar(dev, 0);
+   info->screen_base = pci_ioremap_wc_bar(dev, 0);
if (!info->screen_base) {
dev_err(info->device, "error remapping base\n");
ret = -ENOMEM;
@@ -1144,13 +1135,9 @@ static int i740fb_probe(struct pci_dev *dev, const 
struct pci_device_id *ent)
 
fb_info(info, "%s frame buffer device\n", info->fix.id);
pci_set_drvdata(dev, info);
-#ifdef CONFIG_MTRR
-   if (mtrr) {
-   par->mtrr_reg = -1;
-   par->mtrr_reg = mtrr_add(info->fix.smem_start,
-   info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-   }
-#endif
+   if (mtrr)
+   par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+ info->fix.smem_len);
return 0;
 
 err_reg_framebuffer:
@@ -1177,13 +1164,7 @@ static void i740fb_remove(struct pci_dev *dev)
 
if (info) {
struct i740fb_par *par = info->par;
-
-#ifdef CONFIG_MTRR
-   if (par->mtrr_reg >= 0) {
-   mtrr_del(par->mtrr_reg, 0, 0);
-   par->mtrr_reg = -1;
-   }
-#endif
+   arch_phys_wc_del(par->wc_cookie);
unregister_framebuffer(info);
fb_dealloc_cmap(&info->cmap);
if (par->ddc_registered)
@@ -1287,10 +1268,8 @@ static int  __init i740fb_setup(char *options)
while ((opt = strsep(&options, ",")) != NULL) {
if (!*opt)
continue;
-#ifdef 

[PATCH v8 1/9] pci: add pci_ioremap_wc_bar()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This lets drivers take advantage of PAT when available. This
should help with the transition of converting video drivers over
to ioremap_wc() to help with the goal of eventually using
_PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
ioremap_nocache() (de33c442e titled "x86 PAT: fix performance
drop for glx, use UC minus for ioremap(), ioremap_nocache() and
pci_mmap_page_range()")

Cc: Toshi Kani 
Cc: Bjorn Helgaas 
Cc: Suresh Siddha 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Juergen Gross 
Cc: Daniel Vetter 
Cc: Andy Lutomirski 
Cc: Dave Airlie 
Cc: Antonino Daplas 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: Ville Syrjälä 
Cc: Mel Gorman 
Cc: Vlastimil Babka 
Cc: Borislav Petkov 
Cc: Davidlohr Bueso 
Cc: linux-fb...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Arnd Bergmann 
Signed-off-by: Luis R. Rodriguez 
---
 drivers/pci/pci.c   | 14 ++
 include/linux/pci.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0008c95..fdae37b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -138,6 +138,20 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int 
bar)
return ioremap_nocache(res->start, resource_size(res));
 }
 EXPORT_SYMBOL_GPL(pci_ioremap_bar);
+
+void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar)
+{
+   /*
+* Make sure the BAR is actually a memory resource, not an IO resource
+*/
+   if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
+   WARN_ON(1);
+   return NULL;
+   }
+   return ioremap_wc(pci_resource_start(pdev, bar),
+ pci_resource_len(pdev, bar));
+}
+EXPORT_SYMBOL_GPL(pci_ioremap_wc_bar);
 #endif
 
 #define PCI_FIND_CAP_TTL   48
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c0dd4ab..1193975 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1657,6 +1657,7 @@ static inline void pci_mmcfg_late_init(void) { }
 int pci_ext_cfg_avail(void);
 
 void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
+void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 
 #ifdef CONFIG_PCI_IOV
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
-- 
2.3.2.209.gd67f9d5.dirty
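
As a rough userspace model of the check in pci_ioremap_wc_bar() above:
refuse anything that is not a memory BAR, otherwise hand back a mapping
marked write-combining. The demo_* types, the flag values and the mapping
struct are assumptions made for illustration only; the real function returns
an __iomem pointer from ioremap_wc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for resource flags; values are arbitrary here. */
#define DEMO_RES_IO  0x1u
#define DEMO_RES_MEM 0x2u

struct demo_bar {
	uint64_t start;
	uint64_t len;
	unsigned int flags;
};

enum demo_cache_mode { DEMO_UNCACHED, DEMO_WRITE_COMBINING };

struct demo_mapping {
	uint64_t start;
	uint64_t len;
	enum demo_cache_mode mode;
	int valid;	/* 0 plays the role of a NULL return */
};

/* Model of "map this BAR write-combining": refuse I/O port BARs. */
struct demo_mapping demo_map_wc_bar(const struct demo_bar *bar)
{
	struct demo_mapping m = { 0, 0, DEMO_UNCACHED, 0 };

	assert(bar != NULL);
	if (!(bar->flags & DEMO_RES_MEM))
		return m;	/* not memory: caller sees an invalid mapping */

	m.start = bar->start;
	m.len = bar->len;
	m.mode = DEMO_WRITE_COMBINING;
	m.valid = 1;
	return m;
}
```

Callers are expected to check the result before touching the mapping, the
same way the converted drivers test info->screen_base.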



[PATCH v8 0/9] pci: add pci_iomap_wc() and pci_ioremap_wc_bar()

2015-06-24 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Boris,

This patchset is part of the long haul of series that addresses removal of
direct use of MTRR and transforms drivers over to use PAT interfaces when
available [0]. Other than this series there is only one more pending series for
that effort, the other one being the atyfb device driver specific changes which
no one has replied to for over one month and I'll soon repost and hope that
Andrew might pick up. The patches in this series were originally split in two
series but I've combined them now given all Acks have been collected and they
are all related. Tomi has provided his Acked-by for all device driver changes.
Bjorn had originally reviewed this series and was comfortable with all the code
except for the use of EXPORT_SYMBOL_GPL() despite new clarifications of how we
can use this for new symbols and our preference for it on new PAT interfaces
[1], despite this Bjorn has clarified he's comfortable with this going in
through another maintainer and in particular Arnd [2]. The v7 series was posted
addressing Arnd, Arnd provided his Acked-by for all PCI and devres changes but
noted he's on parental leave and not taking any patches for arm-soc or
asm-generic until he's back at work in around 3 months from now [2] so he
suggested to see if I could find another maintainer to have these go through.

This v8 is unmodified except for the devres commit. Since those routines
are not yet used by any device driver, for now I've skipped exporting
the symbols, but noted that if they are exported it must be with
EXPORT_SYMBOL_GPL(). Once a driver needs them upstream we can export
these.

Although I had test compiled this before, just to be safe I went ahead and
successfully test-compiled this set with allmodconfig, especially since I've now
removed the exports for the devres routines.  Please let me know if these might
be able to go through you or if there are any questions. I will note the recent
discussion with Benjamin over the v7 series concluded that the ideas we both
were alluding to, on automating the WC effects for devices, seem a bit
too idealistic for PCI / PCIE for now, but perhaps we should at least consider
this in the future for userspace mmap() calls [3].

[0] http://lkml.kernel.org/r/CAB=NE6UgtdSoBsA=8+ueyrazhdnwusmqaohhaaefqudbrsy...@mail.gmail.com
[1] http://lkml.kernel.org/r/caerspo4sha-f83x1nw2qdlt9gdubfxcq7uejmsffc5gbjj8...@mail.gmail.com
[2] http://lkml.kernel.org/r/caerspo7cnh1wpgqjceu8etxifnp_piq3cbwnkiwqpuad-fd...@mail.gmail.com
[3] http://lkml.kernel.org/r/1435193521.3790.26.ca...@kernel.crashing.org

Luis R. Rodriguez (9):
  pci: add pci_ioremap_wc_bar()
  video: fbdev: i740fb: use arch_phys_wc_add() and pci_ioremap_wc_bar()
  video: fbdev: kyrofb: use arch_phys_wc_add() and pci_ioremap_wc_bar()
  video: fbdev: gxt4500: use pci_ioremap_wc_bar() for framebuffer
  PCI: Add pci_iomap_wc() variants
  lib: devres: add pcim_iomap_wc() variants
  video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc()
  video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()
  video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()

 drivers/pci/pci.c| 14 
 drivers/video/fbdev/arkfb.c  | 36 +++
 drivers/video/fbdev/gxt4500.c|  2 +-
 drivers/video/fbdev/i740fb.c | 35 --
 drivers/video/fbdev/kyro/fbdev.c | 33 ++---
 drivers/video/fbdev/s3fb.c   | 35 --
 drivers/video/fbdev/vt8623fb.c   | 31 
 include/asm-generic/pci_iomap.h  | 14 
 include/linux/pci.h  |  3 ++
 include/video/kyro.h |  4 +--
 lib/devres.c | 76 
 lib/pci_iomap.c  | 61 
 12 files changed, 204 insertions(+), 140 deletions(-)

-- 
2.3.2.209.gd67f9d5.dirty



Re: [PATCH v4 0/6] x86: document and address MTRR corner cases

2015-06-24 Thread Luis R. Rodriguez
On Fri, Jun 19, 2015 at 3:22 PM, Luis R. Rodriguez  wrote:
> Tomi, Dave, Andy,
>
> It's been one month now since posting the last unmodified version
> (other than commit log) of this series [0] and no word or follow up
> from Ville. The merge window is closing in and other than the PCI
> changes this would be the last pending series. Can I trouble one of
> you for your review ? I will note that this series depends on the
> ioremap_uc() which went in through Ingo's tree and visible on
> linux-next.
>
> [0] http://lkml.kernel.org/r/20150529174051.gc23...@wotan.suse.de

Alright, I'll poke to see if Andrew might take these then. I'll post a
new clean series just to be crystal clear as this is a complex set, I
admit and it may be worth re-iterating things.

 Luis


Re: [PATCH 02/12] [media] dvb-pll: Add support for THOMSON DTT7546X tuner.

2015-06-24 Thread Joe Perches
On Wed, 2015-06-24 at 16:11 +0100, Peter Griffin wrote:
> This is used in conjunction with the STV0367 demodulator on
> the STV0367-NIM-V1.0 NIM card which can be used with the STi
> STB SoC's.

Barely associated to this specific patch, but for
dvb-pll.c, another thing that seems possible is to
convert the struct dvb_pll_desc uses to const and
change the "entries" fixed array size from 12 to []

It'd save a couple KB overall and remove ~5KB of data.

$ size drivers/media/dvb-frontends/dvb-pll.o*
   text    data     bss     dec     hex filename
   8520    1552    2120   12192    2fa0 drivers/media/dvb-frontends/dvb-pll.o.new
   5624    6363    2120   14107    371b drivers/media/dvb-frontends/dvb-pll.o.old
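
To see where the saving from "entries[]" comes from, here is a small
standalone sketch comparing the storage of a descriptor with a fixed
12-slot array against one with a flexible array member. The demo_* structs
are simplified stand-ins, not the real struct dvb_pll_desc. Note also that
statically initializing a flexible array member, as the patch below does,
relies on a GNU C extension, and that const objects are normally counted in
the "text" column by size, which would explain text growing while data
shrinks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_entry {
	uint32_t limit;
	uint32_t stepsize;
	uint8_t config;
	uint8_t cb;
};

/* Every descriptor reserves room for 12 entries, used or not. */
struct demo_desc_fixed {
	const char *name;
	int count;
	struct demo_entry entries[12];
};

/* Flexible array member: storage grows only with the entries present. */
struct demo_desc_flex {
	const char *name;
	int count;
	struct demo_entry entries[];
};

/* Bytes taken by one fixed-size descriptor. */
size_t demo_fixed_bytes(void)
{
	return sizeof(struct demo_desc_fixed);
}

/* Bytes taken by a flexible descriptor holding `count` entries. */
size_t demo_flex_bytes(int count)
{
	assert(count >= 0);
	return sizeof(struct demo_desc_flex) +
	       (size_t)count * sizeof(struct demo_entry);
}
```

A descriptor with only two or three entries then costs a fraction of the
fixed layout, and the difference adds up across all tuner descriptions.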
---
 drivers/media/dvb-frontends/dvb-pll.c | 50 +--
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/media/dvb-frontends/dvb-pll.c 
b/drivers/media/dvb-frontends/dvb-pll.c
index 6d8fe88..53089e1 100644
--- a/drivers/media/dvb-frontends/dvb-pll.c
+++ b/drivers/media/dvb-frontends/dvb-pll.c
@@ -34,7 +34,7 @@ struct dvb_pll_priv {
struct i2c_adapter *i2c;
 
/* the PLL descriptor */
-   struct dvb_pll_desc *pll_desc;
+   const struct dvb_pll_desc *pll_desc;
 
/* cached frequency/bandwidth */
u32 frequency;
@@ -57,7 +57,7 @@ MODULE_PARM_DESC(id, "force pll id to use (DEBUG ONLY)");
 /* --- */
 
 struct dvb_pll_desc {
-   char *name;
+   const char *name;
u32  min;
u32  max;
u32  iffreq;
@@ -71,13 +71,13 @@ struct dvb_pll_desc {
u32 stepsize;
u8  config;
u8  cb;
-   } entries[12];
+   } entries[];
 };
 
 /* --- */
 /* descriptions*/
 
-static struct dvb_pll_desc dvb_pll_thomson_dtt7579 = {
+static const struct dvb_pll_desc dvb_pll_thomson_dtt7579 = {
.name  = "Thomson dtt7579",
.min   = 17700,
.max   = 85800,
@@ -99,7 +99,7 @@ static void thomson_dtt759x_bw(struct dvb_frontend *fe, u8 
*buf)
buf[3] |= 0x10;
 }
 
-static struct dvb_pll_desc dvb_pll_thomson_dtt759x = {
+static const struct dvb_pll_desc dvb_pll_thomson_dtt759x = {
.name  = "Thomson dtt759x",
.min   = 17700,
.max   = 89600,
@@ -123,7 +123,7 @@ static void thomson_dtt7520x_bw(struct dvb_frontend *fe, u8 
*buf)
buf[3] ^= 0x10;
 }
 
-static struct dvb_pll_desc dvb_pll_thomson_dtt7520x = {
+static const struct dvb_pll_desc dvb_pll_thomson_dtt7520x = {
.name  = "Thomson dtt7520x",
.min   = 18500,
.max   = 9,
@@ -141,7 +141,7 @@ static struct dvb_pll_desc dvb_pll_thomson_dtt7520x = {
},
 };
 
-static struct dvb_pll_desc dvb_pll_lg_z201 = {
+static const struct dvb_pll_desc dvb_pll_lg_z201 = {
.name  = "LG z201",
.min   = 17400,
.max   = 86200,
@@ -157,7 +157,7 @@ static struct dvb_pll_desc dvb_pll_lg_z201 = {
},
 };
 
-static struct dvb_pll_desc dvb_pll_unknown_1 = {
+static const struct dvb_pll_desc dvb_pll_unknown_1 = {
.name  = "unknown 1", /* used by dntv live dvb-t */
.min   = 17400,
.max   = 86200,
@@ -179,7 +179,7 @@ static struct dvb_pll_desc dvb_pll_unknown_1 = {
 /* Infineon TUA6010XS
  * used in Thomson Cable Tuner
  */
-static struct dvb_pll_desc dvb_pll_tua6010xs = {
+static const struct dvb_pll_desc dvb_pll_tua6010xs = {
.name  = "Infineon TUA6010XS",
.min   =  4425,
.max   = 85800,
@@ -193,7 +193,7 @@ static struct dvb_pll_desc dvb_pll_tua6010xs = {
 };
 
 /* Panasonic env57h1xd5 (some Philips PLL ?) */
-static struct dvb_pll_desc dvb_pll_env57h1xd5 = {
+static const struct dvb_pll_desc dvb_pll_env57h1xd5 = {
.name  = "Panasonic ENV57H1XD5",
.min   =  4425,
.max   = 85800,
@@ -217,7 +217,7 @@ static void tda665x_bw(struct dvb_frontend *fe, u8 *buf)
buf[3] |= 0x08;
 }
 
-static struct dvb_pll_desc dvb_pll_tda665x = {
+static const struct dvb_pll_desc dvb_pll_tda665x = {
.name  = "Philips TDA6650/TDA6651",
.min   =  4425,
.max   = 85800,
@@ -251,7 +251,7 @@ static void tua6034_bw(struct dvb_frontend *fe, u8 *buf)
buf[3] |= 0x08;
 }
 
-static struct dvb_pll_desc dvb_pll_tua6034 = {
+static const struct dvb_pll_desc dvb_pll_tua6034 = {
.name  = "Infineon TUA6034",
.min   =  4425,
.max   = 85800,
@@ -275,7 +275,7 @@ static void tded4_bw(struct dvb_frontend *fe, u8 *buf)
buf[3] |= 0x04;
 }
 
-static struct dvb_pll_desc dvb_pll_tded4 = {
+static const struct dvb_pll_desc dvb_pll_tded4 = {
.name = "ALPS TDED4",
.min = 4700,
.max = 86300,
@@ -293,7 +293,7 @@ static struct dvb_pll_desc 

Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Benjamin Herrenschmidt
On Wed, 2015-06-24 at 17:58 -0700, Luis R. Rodriguez wrote:
> On Wed, Jun 24, 2015 at 5:52 PM, Benjamin Herrenschmidt
>  wrote:
> > On Thu, 2015-06-25 at 02:08 +0200, Luis R. Rodriguez wrote:
> >>
> >> OK thanks I'll proceed with these patches then.
> >>
> >> > As for user mappings,
> >>
> >> Which APIs were you considering in this regard BTW?
> >
> > mmap of the generic /sys/bus/pci/.../resource*
> 
> Like? Got a demo patch in mind ? :)

Nope. I was just thinking out loud. Today I have yet to see a problem
with what we do so ...

Cheers,
Ben.




RE: [PATCH v2 03/28] ACPICA: Hardware: Enable 64-bit firmware waking vector for selected FACS.

2015-06-24 Thread Zheng, Lv
Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Sent: Wednesday, June 24, 2015 10:06 PM
> 
> On Wednesday, June 24, 2015 11:02:10 AM Lv Zheng wrote:
> > ACPICA commit 7aa598d711644ab0de5f70ad88f1e2de253115e4
> >
> > The following commit is reported to have broken s2ram on some platforms:
> >  Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd
> >  ACPICA: Add option to favor 32-bit FADT addresses.
> > The platform reports 2 FACS tables (which is not allowed by ACPI
> > specification) and the new 32-bit address favor rule forces OSPMs to use
> > the FACS table reported via FADT's X_FIRMWARE_CTRL field.
> >
> > The root cause of the reported bug might be one of the following:
> > 1. BIOS may favor the 64-bit firmware waking vector address when the
> >version of the FACS is greater than 0 and Linux currently only supports
> >resuming from the real mode, so the 64-bit firmware waking vector has
> >never been set and might be invalid to BIOS while the commit enables
> >higher version FACS.
> > 2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
> >FADT while the commit doesn't set the firmware waking vector address of
> >the FACS reported by "FIRMWARE_CTRL", it only sets the firmware waking
> >vector address of the FACS reported by "X_FIRMWARE_CTRL".
> >
> > This patch excludes the cases that can trigger the bugs caused by the root
> > cause 1.
> >
> > ACPI specification says:
> > A. 32-bit FACS address (FIRMWARE_CTRL field in FADT):
> >Physical memory address of the FACS, where OSPM and firmware exchange
> >control information.
> >If the X_FIRMWARE_CTRL field contains a non zero value then this field
> >must be zero.
> >A zero value indicates that no FACS is specified by this field.
> > B. 64-bit FACS address (X_FIRMWARE_CTRL field in FADT):
> >64bit physical memory address of the FACS.
> >This field is used when the physical address of the FACS is above 4GB.
> >If the FIRMWARE_CTRL field contains a non zero value then this field
> >must be zero.
> >A zero value indicates that no FACS is specified by this field.
> > Thus the 32bit and 64bit firmware waking vector should indicate completely
> > different resuming environment - real mode (1MB addressable) and non real
> > mode (4GB+ addressable) and currently Linux only supports resuming from
> > real mode.
> >
> > This patch enables 64-bit firmware waking vector for selected FACS via
> > acpi_set_firmware_waking_vector() so that it's up to OSPMs to determine
> > which resuming mode should be used by BIOS and ACPICA changes won't
> > trigger the bugs caused by the root cause 1. For example, Linux can pass
> > physical_address64=0 as the parameter of
> > acpi_set_firmware_waking_vector() to indicate no 64bit waking vector
> > support. Lv Zheng.
> >
> > This patch also updates acpi_set_firmware_waking_vector() invocations in
> > order to keep 32-bit firmware waking vector favor for Linux. 64-bit
> > firmware waking vector has never been enabled by Linux.  The
> > (acpi_physical_address)0 for 64-bit address can be used to force ACPICA to
> > set only 32-bit firmware waking vector for Linux.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
> > Link: https://github.com/acpica/acpica/commit/7aa598d7
> > Cc: 3.14.1+  # 3.14.1+
> > Reported-and-tested-by: Oswald Buddenhagen 
> > Signed-off-by: Lv Zheng 
> > Signed-off-by: Bob Moore 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: x...@kernel.org
> > Cc: Tony Luck 
> > Cc: Fenghua Yu 
> > Cc: linux-i...@vger.kernel.org
> > ---
> >  arch/ia64/include/asm/acpi.h|3 +-
> >  arch/ia64/kernel/acpi.c |2 --
> >  arch/x86/include/asm/acpi.h |3 +-
> >  drivers/acpi/acpica/hwxfsleep.c |   61 
> > ---
> >  drivers/acpi/sleep.c|8 +++--
> >  include/acpi/acpixf.h   |   11 +++
> >  6 files changed, 33 insertions(+), 55 deletions(-)
> >
> > diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
> > index aa0fdf1..0ac4fab 100644
> > --- a/arch/ia64/include/asm/acpi.h
> > +++ b/arch/ia64/include/asm/acpi.h
> > @@ -79,7 +79,8 @@ int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);
> >  /* Low-level suspend routine. */
> >  extern int acpi_suspend_lowlevel(void);
> >
> > -extern unsigned long acpi_wakeup_address;
> > +#define acpi_wakeup_address((acpi_physical_address)0)
> > +#define acpi_wakeup_address64  ((acpi_physical_address)0)
> >
> >  /*
> >   * Record the cpei override flag and current logical cpu. This is
> > diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
> > index b1698bc..1b08d6f 100644
> > --- a/arch/ia64/kernel/acpi.c
> > +++ b/arch/ia64/kernel/acpi.c
> > @@ -60,8 +60,6 @@ int acpi_lapic;
> >  unsigned int acpi_cpei_override;
> >  unsigned int acpi_cpei_phys_cpuid;
> >
> > -unsigned long acpi_wakeup_address = 0;
> > -
> >  #ifdef 

Re: [PATCH] sched: split sched_switch trace event into two

2015-06-24 Thread Steven Rostedt
On Wed, 24 Jun 2015 16:19:33 -0700
Cong Wang  wrote:

> For compatibility, the sched_switch event is not touched.

Yes, and sched_out() should not be added.

> 
> Cc: Steven Rostedt 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Signed-off-by: Cong Wang 
> Signed-off-by: Cong Wang 
> ---
>  include/trace/events/sched.h | 51 
> +++-
>  kernel/sched/core.c  |  2 ++
>  2 files changed, 52 insertions(+), 1 deletion(-)
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index d57a575..c31f1e0 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -112,8 +112,57 @@ static inline long __trace_sched_switch_state(struct 
> task_struct *p)
>  #endif /* CREATE_TRACE_POINTS */
>  
>  /*
> - * Tracepoint for task switches, performed by the scheduler:
> + * Tracepoints for task switches, performed by the scheduler:
>   */
> +TRACE_EVENT(sched_out,
> +
> + TP_PROTO(struct task_struct *curr),
> +
> + TP_ARGS(curr),
> +
> + TP_STRUCT__entry(
> + __array(char,   comm,   TASK_COMM_LEN   )
> + __field(int,prio)
> + __field(long,   state   )
> + ),
> +
> + TP_fast_assign(
> + __entry->prio   = curr->prio;
> + __entry->state  = __trace_sched_switch_state(curr);
> + memcpy(__entry->comm, curr->comm, TASK_COMM_LEN);
> + ),
> +
> + TP_printk("comm=%s prio=%d state=%s%s",
> + __entry->comm, __entry->prio,
> + __entry->state & (TASK_STATE_MAX-1) ?
> +   __print_flags(__entry->state & (TASK_STATE_MAX-1), "|",
> + { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" },
> + { 16, "Z" }, { 32, "X" }, { 64, "x" },
> + { 128, "K" }, { 256, "W" }, { 512, "P" },
> + { 1024, "N" }) : "R",
> + __entry->state & TASK_STATE_MAX ? "+" : "")
> +);
> +
> +TRACE_EVENT(sched_in,
> +
> + TP_PROTO(struct task_struct *next),
> +
> + TP_ARGS(next),
> +
> + TP_STRUCT__entry(
> + __array(char,   comm,   TASK_COMM_LEN   )
> + __field(int,prio)
> + ),
> +
> + TP_fast_assign(
> + memcpy(__entry->comm, next->comm, TASK_COMM_LEN);
> + __entry->prio   = next->prio;
> + ),
> +
> + TP_printk("comm=%s prio=%d",
> + __entry->comm, __entry->prio)
> +);
> +
>  TRACE_EVENT(sched_switch,
>  
>   TP_PROTO(struct task_struct *prev,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c86935a..681fc50 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2219,6 +2219,7 @@ prepare_task_switch(struct rq *rq, struct task_struct 
> *prev,
>   struct task_struct *next)
>  {
>   trace_sched_switch(prev, next);
> + trace_sched_out(prev);

Tracepoints are low overhead, but they do take up space. This is a
useless tracepoint. If anything, I'll work on adding an alias or
something. But please don't add a tracepoint next to a tracepoint that
encompasses the data.

>   sched_info_switch(rq, prev, next);
>   perf_event_task_sched_out(prev, next);
>   fire_sched_out_preempt_notifiers(prev, next);
> @@ -2288,6 +2289,7 @@ static struct rq *finish_task_switch(struct task_struct 
> *prev)
>   }
>  
>   tick_nohz_task_switch(current);
> + trace_sched_in(current);

Why not have a:

sched_switch_post(prev, current);

That way, the hook can be useful for other tools.

-- Steve

>   return rq;
>  }
>  



Re: [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Benjamin Herrenschmidt
On Wed, 2015-06-24 at 18:38 +0200, Luis R. Rodriguez wrote:
> On Wed, Jun 24, 2015 at 08:42:23AM +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2015-06-19 at 15:08 -0700, Luis R. Rodriguez wrote:
> > > From: "Luis R. Rodriguez" 
> > > 
> > > PCI BARs tell us whether prefetching is safe, but they don't say anything
> > > about write combining (WC).  WC changes ordering rules and allows writes
> > > to be collapsed, so it's not safe in general to use it on a prefetchable
> > > region.
> > 
> > Well, the PCIe spec at least specifies that a prefetchable BAR also
> > tolerates write merging... 
> 
> How can that be determined and can that be used as a full bullet proof hint
> to enable wc ? And are you sure? :) 

Well, I'm sure the spec says that ;-) But it could be new to PCIe, I
haven't checked legacy PCI.

> Reason all this was stated was to be
> apologetic over why we can't automate this behind the scenes. Otherwise
> we could amend what you stated into the commit log to elaborate on our
> technical apology. Let me know!

At least on powerpc, for mmap of resource to userspace, we take off the
guarded bit in the PTE for prefetchable BARs. This has the effect
architecturally of enabling both prefetch and write combine (ie. side
effect) though afaik, the implementations probably don't actually
prefetch. We've done that for years.

In fact we don't have a way to split the notions, it's either G or no G,
which carries both meanings.

Do you have example/case of a device having problems ?

Cheers,
Ben.




Re: [PATCH] ARM64: smp: Silence suspicious RCU usage with ipi tracepoints

2015-06-24 Thread Steven Rostedt
On Wed, 24 Jun 2015 23:29:30 +0200
Peter Zijlstra  wrote:

> On Wed, Jun 24, 2015 at 01:14:18PM -0700, Stephen Boyd wrote:
> > John Stultz reported an RCU splat on ARM with ipi trace events
> > enabled. It looks like the same problem exists on ARM64.
> > 
> > At this point in the IPI handling path we haven't called
> > irq_enter() yet, so RCU doesn't know that we're about to exit
> > idle and properly warns that we're using RCU from an idle CPU.
> > Use trace_ipi_entry_rcuidle() instead of trace_ipi_entry() so
> > that RCU is informed about our exit from idle.
> 
> I have a problem with $subject. It says 'silence', whereas afaict this
> fixes an actual bug, so it should be 'fixes'.

Agreed, otherwise Acked-by: Steven Rostedt 

-- Steve


Re: changing format/size of data in TRACE_EVENT(extlog_mem_event)

2015-06-24 Thread Steven Rostedt
On Wed, 24 Jun 2015 14:56:49 -0700
"Luck, Tony"  wrote:


> So the question is - how can we update the trace event to include these
> new wider fields with the minimum pain to applications that look at it?
> I don't know if there are any other consumers besides rasdaemon at the
> moment ... but we don't want ugly transitions where you have to guess
> which version of the application you need to run to work with a given
> kernel version.

It comes down to if the rasdaemon (and any other user) included the
event_parse.c "library" (it's not a public library yet, and we really
should make it one). Because if it did, it doesn't matter what the
field is, the event descriptions will give the size, and as long as the
name of a field exists, and it doesn't change type (that is, from
integer to string), it should be fine.

-- Steve


Re: [PATCH v7 00/16] libnvdimm: non-volatile memory devices

2015-06-24 Thread Toshi Kani
On Wed, 2015-06-17 at 19:13 -0400, Dan Williams wrote:
> A new sub-system in support of non-volatile memory storage devices.
> 
> Stephen, please add libnvdimm-for-next to -next:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm libnvdimm-for-next
> 
> Changes since v6 [1]:
> 
> 1/ Deferred the patches dependent on ->rw_bytes() (BTT - stacked block
>driver, BLK - mmio aperture windows driver, NFIT_TEST - unit test
>infrastructure for all libnvdimm + nfit components) to their own
>patchset. Make the ->rw_bytes() implementation the first patch in
>that series (Christoph)
> 
> 2/ Collected acks from Christoph and Rafael!
> 
> 3/ Add a HAS_IOMEM dependency to CONFIG_BLK_DEV_PMEM following commit
>b6f2098fb708 "block: pmem: Add dependency on HAS_IOMEM" in 4.1-rc8.
> 
> 4/ Move libnvdimm to subsys_initcall() and move arch/x86/kernel/pmem.c
>back to device_initcall().  This allows ACPI_NFIT to be built-in.
>(Linda)
> 
> 5/ Drop the ACPI_DRIVER_ALL_NOTIFY_EVENTS flag in the nfit driver.
>(Rafael)
> 
> 6/ Reference count the nvdimm_drvdata object.  This fixes a bug that was
>found when the unit tests were extended to test disabling an nvdimm
>while a region device still had references to label data.
> 
 :
> Dan Williams (16):
>   e820, efi: add ACPI 6.0 persistent memory types
>   libnvdimm, nfit: initial libnvdimm infrastructure and NFIT support
>   libnvdimm: control character device and nvdimm_bus sysfs attributes
>   libnvdimm, nfit: dimm/memory-devices
>   libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices
>   libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure
>   libnvdimm, nfit: regions (block-data-window, persistent memory, volatile memory)
>   libnvdimm: support for legacy (non-aliasing) nvdimms
>   libnvdimm, pmem: move pmem to drivers/nvdimm/
>   libnvdimm, pmem: add libnvdimm support to the pmem driver
>   libnvdimm, nfit: add interleave-set state-tracking infrastructure
>   libnvdimm: namespace indices: read and validate
>   libnvdimm: pmem label sets and namespace instantiation.
>   libnvdimm: blk labels and namespace instantiation
>   libnvdimm: write pmem label set
>   libnvdimm: write blk label set

We have been successfully running this patchset on our NFIT-enabled
prototype systems with pmem.  (Intel example _DSM, label, blk are not
available for testing.)

So for patches 1/16 to 4/16, and 6/16 to 10/16:

Tested-by: Toshi Kani 

Thanks,
-Toshi



Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Luis R. Rodriguez
On Wed, Jun 24, 2015 at 5:52 PM, Benjamin Herrenschmidt
 wrote:
> On Thu, 2015-06-25 at 02:08 +0200, Luis R. Rodriguez wrote:
>>
>> OK thanks I'll proceed with these patches then.
>>
>> > As for user mappings,
>>
>> Which APIs were you considering in this regard BTW?
>
> mmap of the generic /sys/bus/pci/.../resource*

Like? Got a demo patch in mind ? :)

 Luis


Re: [RFCv2][PATCH 1/7] fs: optimize inotify/fsnotify code for unwatched files

2015-06-24 Thread Eric Paris
On Wed, 2015-06-24 at 17:16 -0700, Dave Hansen wrote:
> From: Dave Hansen 
> 
> I have a _tiny_ microbenchmark that sits in a loop and writes
> single bytes to a file.  Writing one byte to a tmpfs file is
> around 2x slower than reading one byte from a file, which is a
> _bit_ more than I expected.  This is a dumb benchmark, but I think
> it's hard to deny that write() is a hot path and we should avoid
> unnecessary overhead there.
> 
> I did a 'perf record' of 30-second samples of read and write.
> The top item in a diffprofile is srcu_read_lock() from
> fsnotify().  There are active inotify fd's from systemd, but
> nothing is actually listening to the file or its part of
> the filesystem.
> 
> I *think* we can avoid taking the srcu_read_lock() for the
> common case where there are no actual marks on the file.
> This means that there will both be nothing to notify for
> *and* implies that there is no need for clearing the ignore
> mask.
> 
> This patch gave a 13.8% speedup in writes/second on my test,
> which is an improvement from the 10.8% that I saw with the
> last version.
> 
> Signed-off-by: Dave Hansen 
> Cc: Andrew Morton 
> Cc: Jan Kara 
> Cc: Al Viro 
> Cc: Eric Paris 
> Cc: John McCutchan 
> Cc: Robert Love 
> Cc: Tim Chen 
> Cc: Andi Kleen 
> Cc: linux-kernel@vger.kernel.org
> ---
> 
>  b/fs/notify/fsnotify.c |   10 ++
>  1 file changed, 10 insertions(+)
> 
> diff -puN fs/notify/fsnotify.c~optimize-fsnotify fs/notify/fsnotify.c
> --- a/fs/notify/fsnotify.c~optimize-fsnotify  2015-06-24 17:14:34.573109264 -0700
> +++ b/fs/notify/fsnotify.c2015-06-24 17:14:34.576109399 -0700
> @@ -213,6 +213,16 @@ int fsnotify(struct inode *to_tell, __u3
>   !(test_mask & to_tell->i_fsnotify_mask) &&
>   !(mnt && test_mask & mnt->mnt_fsnotify_mask))
>   return 0;
> + /*
> +  * Optimization: srcu_read_lock() has a memory barrier which can
> +  * be expensive.  It protects walking the *_fsnotify_marks lists.
> +  * However, if we do not walk the lists, we do not have to do
> +  * SRCU because we have no references to any objects and do not
> +  * need SRCU to keep them "alive".
> +  */
> + if (!to_tell->i_fsnotify_marks.first &&
> + (!mnt || !mnt->mnt_fsnotify_marks.first))
> + return 0;

two useless peeps from the old peanut gallery of long lost

1) should you actually move this check up before the IN_MODIFY check?
This seems like it would be by far the most common case, and you'd save
yourself a bunch of useless conditionals/bit operations.

2) do you want to use hlist_empty(&to_tell->i_fsnotify_marks) instead,
for readability (and they are static inline, so compiled code is the
same)
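
To make point 2) concrete, here is a small user-space sketch. The struct definitions are minimal stand-ins for the kernel's hlist types, and no_marks() is a made-up helper name that only mirrors the early-return condition in the quoted hunk; a NULL mnt_marks pointer models the "!mnt" case. It is an illustration, not the actual fsnotify code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Same test as open-coding !h->first, just with a name. */
static inline int hlist_empty(const struct hlist_head *h)
{
	return !h->first;
}

/*
 * Hypothetical helper mirroring the patch's early-return condition:
 * true when neither the inode nor the (optional) mount has any marks.
 */
static inline int no_marks(const struct hlist_head *inode_marks,
			   const struct hlist_head *mnt_marks)
{
	assert(inode_marks != NULL);
	return hlist_empty(inode_marks) &&
	       (!mnt_marks || hlist_empty(mnt_marks));
}
```

Since hlist_empty() is a static inline that only tests the first pointer, the named form should compile to the same code as the open-coded check.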

It is fine as it is. Don't know how much you want to try to bikeshed...



Re: [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants

2015-06-24 Thread Benjamin Herrenschmidt
On Thu, 2015-06-25 at 02:08 +0200, Luis R. Rodriguez wrote:
> 
> OK thanks I'll proceed with these patches then.
> 
> > As for user mappings,
> 
> Which APIs were you considering in this regard BTW?

mmap of the generic /sys/bus/pci/.../resource*

> > maybe the right thing to do is to let us do what we do by
> > default with a quirk that can set a flag in pci_dev to disable that
> > behaviour (maybe on a per BAR basis ?).
> 
> That might mean it could restrict userspace WC to require devices
> to have WC parts on a full PCI BAR. Although this is restrictive
> having reviewed most WC uses in the kernel I'd think this would be
> a fair compromise to make, but again, if things are still murky
> perhaps best we kiss this idea good bye for now and hope for it
> to come in on future buses or amendments (if that's even possible?).
> 
> > I think the common case is that WC works.
> 
> If WC does not I will note one hack which might be worth mentioning --
> just for the record, this was devised as a shortcoming of a device where
> they failed to split things properly and that *without* WC performance
> suffered quite a bit so they made one full PCI BAR WC and as a work
> around this:
> 
> http://lkml.kernel.org/r/20150416041837.GA5712@hykim-PC
> 
> That is for registers that needed it:
> 
> write; wmb;
> 
> Then if they wanted to wait till the NIC has seen the write, they did:
> 
> write; wmb; read;
> 

Right, and as I mentioned, on some archs like powerpc (and possibly
more), writel() and co contains an implicit mb()

>   Luis--




[RFC PATCH 09/10] mm/compaction: redesign compaction

2015-06-24 Thread Joonsoo Kim
Currently, compaction works as follows.
1) The migration scanner scans from zone_start_pfn toward zone_end_pfn
to find migratable pages.
2) The free scanner scans from zone_end_pfn toward zone_start_pfn to
find free pages.
3) When the two scanners cross, compaction is finished.
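
As a rough illustration of these three steps, the following user-space toy models a zone as an array of page states. It is only a sketch: toy_compact() and enum page_state are names invented here, every used page is treated as migratable, and none of the kernel's pageblock, isolation or locking logic is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page states; the real kernel tracks far more than this. */
enum page_state { PAGE_FREE = 0, PAGE_USED = 1 };

/*
 * Toy model of the two-scanner scheme (hypothetical helper, not kernel
 * code).  Returns the number of pages migrated and stores the pfn where
 * the scanners met in *met_at.
 */
static size_t toy_compact(enum page_state *zone, size_t nr_pages,
			  size_t *met_at)
{
	size_t migrate_pfn = 0;		/* 1) scans up from the zone start */
	size_t free_pfn = nr_pages;	/* 2) scans down from the zone end */
	size_t migrated = 0;

	assert(zone != NULL && met_at != NULL);

	for (;;) {
		/* Migration scanner: advance to the next used page. */
		while (migrate_pfn < free_pfn && zone[migrate_pfn] != PAGE_USED)
			migrate_pfn++;
		/* Free scanner: step back to the next free page. */
		while (free_pfn > migrate_pfn && zone[free_pfn - 1] != PAGE_FREE)
			free_pfn--;
		/* 3) The scanners crossed: compaction is finished. */
		if (migrate_pfn >= free_pfn)
			break;
		/* Move the used page into the free slot. */
		zone[free_pfn - 1] = PAGE_USED;
		zone[migrate_pfn] = PAGE_FREE;
		migrate_pfn++;
		free_pfn--;
		migrated++;
	}
	*met_at = migrate_pfn;
	return migrated;
}
```

With an 8-page zone laid out as used, free, used, free, free, free, used, free, the toy migrates two pages and the scanners meet at pfn 5. Pages above the meeting point are never visited by the migration scanner, and the number of migrations can never exceed the number of free pages found.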

This algorithm has some drawbacks. 1) The back of the zone cannot be
scanned by the migration scanner because the migration scanner can't
pass the freepage scanner. So, although there are some high order page
candidates at the back of the zone, we can't utilize them.
Another weakness is 2) compaction's success depends heavily on the
amount of free pages. Compaction can migrate at most as many used pages
as there are free pages. If that effort doesn't produce a high order
page, the two scanners meet and compaction fails.

We can easily observe problem 1) with the following test.

Memory is artificially fragmented to make order 3 allocation hard. And,
most pageblocks are movable migratetype.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and 200 MB is
occupied by memory hogger. Most pageblocks are movable
migratetype.
  Fragmentation: Successful order 3 allocation candidates may be around
1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
with GFP_NORETRY. This value is determined to saturate allocation
success.

Test: hogger-frag-movable

                          nonmovable
compact_free_scanned         5883401
compact_isolated               83201
compact_migrate_scanned      2755690
compact_stall                    664
compact_success                  102
pgmigrate_success              38663
Success:                          26
Success(N):                       56

Columns 'Success' and 'Success(N)' are calculated by the following equations.

Success = successful allocation * 100 / attempts
Success(N) = successful allocation * 100 /
number of order-3 candidates
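
A minimal sketch of these two equations in C, assuming integer percentages with truncating division (the mail does not say how the benchmark scripts round). calc_success() and struct success_rates are names made up for this illustration.

```c
#include <assert.h>

struct success_rates {
	unsigned int success;	/* percent of all allocation attempts */
	unsigned int success_n;	/* percent of order-3 candidates */
};

/* Hypothetical helper: applies the two equations with integer division. */
static struct success_rates calc_success(unsigned int successful,
					 unsigned int attempts,
					 unsigned int candidates)
{
	struct success_rates r;

	assert(attempts > 0 && candidates > 0);
	r.success = successful * 100 / attempts;
	r.success_n = successful * 100 / candidates;
	return r;
}
```

For a hypothetical run with 780 successful allocations out of 3000 attempts and 1500 candidates, this gives Success = 26 and Success(N) = 52. The attempt and candidate counts quoted in this mail are only rough, so the sketch is not expected to reproduce the table values exactly.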

As mentioned above, there are roughly 1500 high order page candidates,
but, compaction just returns 56% of them, because migration scanner
can't pass over freepage scanner. With new compaction approach, it can
be increased to 94% by this patch.

To check 2), hogger-frag-movable benchmark is used again, but, with some
tweaks. The amount of memory allocated by the memory hogger varies.

Test: hogger-frag-movable with free memory variation

bzImage-improve-base
Hogger:         150MB   200MB   250MB   300MB
Success:           41      25      17       9
Success(N):        87      53      37      22

As background knowledge, up to 250MB, there is enough
memory for all order-3 allocation attempts to succeed. In the 300MB case,
available memory before starting the allocation attempts is just 57MB,
so not all attempts can succeed.

Anyway, as free memory decreases, compaction success rate also decreases.
It is better to remove this dependency to get stable compaction result
in any case.

This patch solves the problems mentioned above.
The freepage scanner is greatly changed to scan the zone from zone_start_pfn
to zone_end_pfn. With this change, the compaction finish condition also
changes: compaction finishes when the migration scanner reaches
zone_end_pfn. With these changes, the migration scanner can traverse
anywhere in the zone.

To prevent back and forth migration within one compaction iteration,
the freepage scanner sets the skip-bit on each pageblock it scans. The
migration scanner checks the bit and skips marked pageblocks, so back
and forth migration cannot happen within one compaction iteration.

If the freepage scanner reaches the end of the zone, it restarts at
zone_start_pfn. This time, the freepage scanner scans the pageblocks where
the migration scanner tried to migrate some pages but failed to make a
high order page. The freepages left there can't become a high order page
due to fragmentation, so they are a good source for the freepage scanner.

With this change, above test result is:

Test: hogger-frag-movable

                          nonmovable    redesign
compact_free_scanned         5883401     8103231
compact_isolated               83201     3108978
compact_migrate_scanned      2755690     4316163
compact_stall                    664        2117
compact_success                  102         234
pgmigrate_success              38663     1547318
Success:                          26          45
Success(N):                       56          94

Test: hogger-frag-movable with free memory variation

Hogger:         150MB   200MB   250MB   300MB
bzImage-improve-base
Success:           41      25      17       9
Success(N):        87      53      37      22

bzImage-improve-threshold
Success:           44      44      42      37
Success(N):        94      92      91      80

Compaction now gives us almost all possible high order pages. Overhead is
highly increased, but, further patch will reduce it greatly
by 

[RFC PATCH 05/10] mm/compaction: make freepage scanner scans non-movable pageblock

2015-06-24 Thread Joonsoo Kim
Currently, the freepage scanner doesn't scan non-movable pageblocks, because
if freepages in non-movable pageblocks are exhausted, another movable
pageblock would be used for non-movable allocation and it could cause
fragmentation.

But the watermark check for compaction doesn't consider this. So, if all
freepages are in non-movable pageblocks, then even though the system has
enough freepages and the watermark check passes, the freepage scanner can't
get any freepage and compaction fails. There is no way to get the precise
number of freepages in movable pageblocks and no way to reclaim only used
pages in movable pageblocks. Therefore, I think that the best way to
overcome this situation is to use freepages in non-movable pageblocks in
compaction.

My test setup for this situation is:

Memory is artificially fragmented to make order 3 allocation hard. And,
most of pageblocks are changed to unmovable migratetype.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and kernel build
is running on background.
  Fragmentation: Successful order 3 allocation candidates may be around
1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
with GFP_NORETRY. This value is determined to saturate allocation
success.

Below is the result of this test.

Test: build-frag-unmovable

                                base  nonmovable
compact_free_scanned         5032378     4110920
compact_isolated               53368      330762
compact_migrate_scanned      1456516     6164677
compact_stall                    538         746
compact_success                   93         350
pgmigrate_success              19926      152754
Success:                          15          31
Success(N):                       33          65

Columns 'Success' and 'Success(N)' are calculated by the following equations.

Success = successful allocation * 100 / attempts
Success(N) = successful allocation * 100 / order 3 candidate

Result shows that success rate is doubled in this case
because we can search more area.

But, we can observe regression in other case.

Test: stress-highalloc in mmtests
(tweaks to request order-7 unmovable allocation)

Ops 1                              30.00        8.33
Ops 2                              32.33       26.67
Ops 3                              91.67       92.00
Compaction stalls                   5110        5581
Compaction success                  1787        1807
Compaction failures                 3323        3774
Compaction pages isolated        6370911    15421622
Compaction migrate scanned      52681405    83721428
Compaction free scanned        418049611   579768237
Compaction cost                     3745        8822

Although this regression is bad, there is also much improvement
in other cases where most pageblocks are non-movable migratetype.
IMHO, this patch can be justified by this improvement. Moreover,
this regression disappears after applying the following patches, so
we don't need to worry about the regression much.

The migration scanner already scans non-movable pageblocks and makes some
freepages in those pageblocks through migration. So, even if the freepage
scanner scans non-movable pageblocks and uses freepages in them, the
number of freepages in non-movable pageblocks wouldn't diminish much and
it wouldn't cause much fragmentation.

Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index dd2063b..8d1b3b5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -905,12 +905,8 @@ static bool suitable_migration_target(struct page *page)
return false;
}
 
-   /* If the block is MIGRATE_MOVABLE or MIGRATE_CMA, allow migration */
-   if (migrate_async_suitable(get_pageblock_migratetype(page)))
-   return true;
-
-   /* Otherwise skip the block */
-   return false;
+   /* Otherwise scan the block */
+   return true;
 }
 
 /*
-- 
1.9.1



[RFC PATCH 00/10] redesign compaction algorithm

2015-06-24 Thread Joonsoo Kim
Recently, I got a report that Android gets slow due to order-2 page
allocation. With some investigation, I found that compaction usually
fails and many pages are reclaimed to make an order-2 freepage. I can't
analyze the detailed reason for the compaction failures because I don't
have a reproducible environment and the compaction code has changed so
much since that version, v3.10. But I was inspired by this report and
started to think about the limitations of the current compaction algorithm.

Limitations of the current compaction algorithm:

1) The migrate scanner can't scan behind the free scanner, because
the scanners start at opposite ends of the zone and move toward each
other. If they meet at some point, compaction is stopped and the
scanners' positions are reset to the two ends of the zone again. In my
experience, the migrate scanner usually doesn't scan beyond half of the
zone range.

2) Compaction capability depends heavily on the amount of free memory.
If there is 50 MB of free memory on a 4 GB system, the migrate scanner
can migrate at most 50 MB of used pages before it meets the free scanner.
If compaction can't make enough high order freepages during this
amount of work, compaction fails. There is no way to escape this
failure situation in the current algorithm, and it will scan the same
region and fail again and again. Then it goes into the compaction
deferring logic and is deferred for some time.

3) Compaction capability depends heavily on the migratetype of memory,
because the freepage scanner doesn't scan unmovable pageblocks.

To investigate compaction limitations, I made some compaction benchmarks.
The base environment of these benchmarks is fragmented memory. Before
testing, 25% of total memory is allocated. With some tricks, these
allocations are evenly distributed over the whole memory range. So, after
allocation is finished, memory is highly fragmented and the possibility of
a successful order-3 allocation is very low. Roughly 1500 order-3
allocations can succeed. Tests attempt an excessive number of allocation
requests, that is, 3000, to find out the algorithm's limits.

There are two variations.

pageblock type (unmovable / movable):

One is that most pageblocks are unmovable migratetype and the other is
that most pageblocks are movable migratetype.

memory usage (memory hogger 200 MB / kernel build with -j8):

Memory hogger means that 200 MB free memory is occupied by hogger.
Kernel build means that a kernel build is running in the background and
it will consume free memory, but the amount consumed fluctuates heavily.

With these variations, I made 4 test cases by mixing them.

hogger-frag-unmovable
hogger-frag-movable
build-frag-unmovable
build-frag-movable

All tests are conducted on 512 MB QEMU virtual machine with 8 CPUs.

I can easily check the weaknesses of the compaction algorithm with the
following tests.

To check 1), hogger-frag-movable benchmark is used. Result is as
follows.

bzImage-improve-base
compact_free_scanned         5240676
compact_isolated               75048
compact_migrate_scanned      2468387
compact_stall                    710
compact_success                   98
pgmigrate_success              34869
Success:                          25
Success(N):                       53

Columns 'Success' and 'Success(N)' are calculated by the following equations.

Success = successful allocation * 100 / attempts
Success(N) = successful allocation * 100 /
number of order-3 candidates

As mentioned above, there are roughly 1500 high-order page candidates,
but compaction returns just 53% of them. With the new compaction approach,
this can be increased to 94%. See the results at the end of this cover
letter.
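
The two percentages above can be computed as in the following minimal C
sketch. It assumes that the denominator of Success(N) is the number of
order-3 candidates available before the test (roughly 1500 here). The
function name success_pct and the sample numbers are illustrative only and
are not taken from the benchmark scripts.

```c
#include <assert.h>

/*
 * Integer percentage, as in the equations above:
 *   Success    = successes * 100 / attempts
 *   Success(N) = successes * 100 / candidates
 * 'total' is either the number of attempts or the assumed number of
 * order-3 candidates; it must be non-zero.
 */
unsigned long success_pct(unsigned long successes, unsigned long total)
{
	assert(total != 0);
	return successes * 100 / total;
}
```

With hypothetical inputs, 750 successful allocations out of 3000 attempts
give success_pct(750, 3000) == 25, and the same 750 measured against 1500
candidates give 50.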

To check 2), the hogger-frag-movable benchmark is used again, but with some
tweaks. The amount of memory allocated by the memory hogger varies.

bzImage-improve-base
Hogger:       150MB   200MB   250MB   300MB
Success:         41      25      17       9
Success(N):      87      53      37      22

As background, up to 250MB there is enough memory for all order-3
allocation attempts to succeed. In the 300MB case, available memory before
starting the allocation attempts is just 57MB, so not all attempts can
succeed.

Anyway, as free memory decreases, the compaction success rate also
decreases. It is better to remove this dependency to get a stable
compaction result in any case.

To check 3), the build-frag-unmovable/movable benchmarks are used.
All factors are the same except the pageblock migratetypes.

Test: build-frag-unmovable

bzImage-improve-base
compact_free_scanned       5032378
compact_isolated             53368
compact_migrate_scanned    1456516
compact_stall                  538
compact_success                 93
pgmigrate_success            19926
Success:                        15
Success(N):                     33

Test: build-frag-movable

bzImage-improve-base
compact_free_scanned       3059086
compact_isolated            129085
compact_migrate_scanned    5029856
compact_stall                  388
compact_success                 99

[RFC PATCH 03/10] mm/compaction: always update cached pfn

2015-06-24 Thread Joonsoo Kim
Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9c5d43c..2d8e211 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -510,6 +510,10 @@ isolate_fail:
if (locked)
spin_unlock_irqrestore(&cc->zone->lock, flags);
 
+   if (blockpfn == end_pfn &&
+   blockpfn > cc->zone->compact_cached_free_pfn)
+   cc->zone->compact_cached_free_pfn = blockpfn;
+
update_pageblock_skip(cc, valid_page, total_isolated,
*start_pfn, end_pfn, blockpfn, false);
 
@@ -811,6 +815,13 @@ isolate_success:
if (locked)
spin_unlock_irqrestore(&zone->lru_lock, flags);
 
+   if (low_pfn == end_pfn && cc->mode != MIGRATE_ASYNC) {
+   int sync = cc->mode != MIGRATE_ASYNC;
+
+   if (low_pfn > zone->compact_cached_migrate_pfn[sync])
+   zone->compact_cached_migrate_pfn[sync] = low_pfn;
+   }
+
update_pageblock_skip(cc, valid_page, nr_isolated,
start_pfn, end_pfn, low_pfn, true);
 
-- 
1.9.1



[RFC PATCH 07/10] mm/compaction: limit compaction activity in compaction depleted state

2015-06-24 Thread Joonsoo Kim
Compaction deferring was introduced to reduce the overhead of compaction
when a compaction attempt is expected to fail. But it has a problem.
The whole zone is rescanned after some compaction attempts are deferred,
and this rescan overhead is quite big. It also imposes a large latency on
one random requestor, while the others fail with nearly zero latency
because their compaction is deferred. This patch tries to handle this
situation differently to solve the above problems.

First, we should know when compaction will fail. The previous patch
defines the compaction depleted state. In this state, compaction failure
is highly expected, so we don't need to put much effort into compaction.
So this patch forces the migration scanner to scan a restricted number of
pages in this state. This way, we can distribute compaction overhead evenly
to all compaction requestors. And there is a way to escape from the
compaction depleted state, so we don't need to unconditionally defer a
specific number of compaction attempts if the compaction possibility
recovers.

In this patch, the migration scanner limit is defined to imitate the
current compaction deferring approach. But we can tune it easily if this
overhead doesn't look appropriate. That would be further work.

There could be a situation where the compaction depleted state is
maintained for a long time. In this case, repeated compaction attempts
would continually cause useless overhead. To optimize this case, this patch
introduces a compaction depletion depth and makes the migration scanner
limit diminish according to this depth. It effectively reduces compaction
overhead in this situation.
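
As a rough illustration of the limit described above, here is a userspace
C model of the arithmetic in set_migration_scan_limit() from the diff
below. It is only a sketch: the two constants are assumed values standing
in for pageblock_nr_pages and COMPACT_CLUSTER_MAX, the name
migration_scan_limit is made up for this example, and the order < 0 check
of the real function is left out.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed values for illustration; the kernel derives them from config. */
#define SKETCH_PAGEBLOCK_NR_PAGES 512UL	/* stands in for pageblock_nr_pages */
#define SKETCH_CLUSTER_MAX         32UL	/* stands in for COMPACT_CLUSTER_MAX */

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Userspace model of the limit calculation. Returns LONG_MAX for
 * "no limit" and -1 for "do not scan at all" (async when depleted).
 */
long migration_scan_limit(unsigned long managed_pages, bool depleted,
			  unsigned long depth, bool async)
{
	unsigned long limit;

	if (!depleted || depth == 0)
		return LONG_MAX;
	if (async)
		return -1;

	limit = managed_pages >> 6;	/* 1/64 of the zone per attempt */
	limit <<= 1;			/* extra credit for sync compaction */
	limit = max_ul(limit, SKETCH_PAGEBLOCK_NR_PAGES);

	/* Halve the limit per depth level; guard against oversized shifts. */
	if (depth >= sizeof(limit) * CHAR_BIT)
		limit = 0;
	else
		limit >>= depth;

	return (long)max_ul(limit, SKETCH_CLUSTER_MAX);
}
```

For a 512 MB zone of 4 KB pages (131072 pages, an assumed figure), the
sketch yields 2048 pages at depth 1 and 512 pages at depth 3, and it
bottoms out at the cluster minimum once the depth grows large.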

Signed-off-by: Joonsoo Kim 
---
 include/linux/mmzone.h |  1 +
 mm/compaction.c| 61 --
 mm/internal.h  |  1 +
 3 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index bd9f1a5..700e9b5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -518,6 +518,7 @@ struct zone {
unsigned intcompact_defer_shift;
int compact_order_failed;
unsigned long   compact_success;
+   unsigned long   compact_depletion_depth;
 #endif
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/mm/compaction.c b/mm/compaction.c
index 9f259b9..aff536f 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -130,6 +130,7 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_DEFER_SHIFT 6
 #define COMPACT_MIN_DEPLETE_THRESHOLD 1UL
+#define COMPACT_MIN_SCAN_LIMIT (pageblock_nr_pages)
 
 static bool compaction_depleted(struct zone *zone)
 {
@@ -147,6 +148,48 @@ static bool compaction_depleted(struct zone *zone)
return true;
 }
 
+static void set_migration_scan_limit(struct compact_control *cc)
+{
+   struct zone *zone = cc->zone;
+   int order = cc->order;
+   unsigned long limit;
+
+   cc->migration_scan_limit = LONG_MAX;
+   if (order < 0)
+   return;
+
+   if (!test_bit(ZONE_COMPACTION_DEPLETED, &zone->flags))
+   return;
+
+   if (!zone->compact_depletion_depth)
+   return;
+
+   /* Stop async migration if depleted */
+   if (cc->mode == MIGRATE_ASYNC) {
+   cc->migration_scan_limit = -1;
+   return;
+   }
+
+   /*
+* Deferred compaction restart compaction every 64 compaction
+* attempts and it rescans whole zone range. If we limit
+* migration scanner to scan 1/64 range when depleted, 64
+* compaction attempts will rescan whole zone range as same
+* as deferred compaction.
+*/
+   limit = zone->managed_pages >> 6;
+
+   /*
+* We don't do async compaction. Instead, give extra credit
+* to sync compaction
+*/
+   limit <<= 1;
+   limit = max(limit, COMPACT_MIN_SCAN_LIMIT);
+
+   /* Degradation scan limit according to depletion depth. */
+   limit >>= zone->compact_depletion_depth;
+   cc->migration_scan_limit = max(limit, COMPACT_CLUSTER_MAX);
+}
 /*
  * Compaction is deferred when compaction fails to result in a page
  * allocation success. 1 << compact_defer_limit compactions are skipped up
@@ -243,8 +286,14 @@ static void __reset_isolation_suitable(struct zone *zone)
zone->compact_cached_free_pfn = end_pfn;
zone->compact_blockskip_flush = false;
 
-   if (compaction_depleted(zone))
-   set_bit(ZONE_COMPACTION_DEPLETED, &zone->flags);
+   if (compaction_depleted(zone)) {
+   if (test_bit(ZONE_COMPACTION_DEPLETED, &zone->flags))
+   zone->compact_depletion_depth++;
+   else {
+   set_bit(ZONE_COMPACTION_DEPLETED, &zone->flags);
+   zone->compact_depletion_depth = 0;
+   }
+   }
zone->compact_success = 0;
 
/* Walk the zone and mark 

[RFC PATCH 06/10] mm/compaction: introduce compaction depleted state on zone

2015-06-24 Thread Joonsoo Kim
Further compaction attempts are deferred when some compaction attempts
have already failed. But after some number of attempts are skipped,
compaction restarts to check whether compaction is now possible. It scans
the whole range of the zone to determine this, and if the compaction
possibility hasn't recovered, this whole-range scan is a big overhead.
As a first step to reduce this overhead, this patch implements a
compaction depleted state on the zone.

Depletion of the compaction possibility is determined by checking the
number of successes in the previous compaction attempts. If the number of
successful compactions is below the specified threshold, we guess that
compaction will not succeed next time, so we mark the zone as compaction
depleted. In this patch, the threshold is chosen as 1 to imitate the
current compaction deferring algorithm. In a following patch, the
compaction algorithm will be changed and this threshold will be adjusted
to that change.

In this patch, only the state definition is implemented. There is no action
for this new state, so there is no functional change. A following patch
will add some handling for this new state.
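
The bookkeeping described above can be pictured with the following small
userspace C sketch. The struct and function names are hypothetical
stand-ins for the fields and hunks in the diff below, the flag is modelled
as a plain bool instead of a bit in zone->flags, and clearing of the flag
is not modelled.

```c
#include <stdbool.h>

#define SKETCH_DEPLETE_THRESHOLD 1UL	/* mirrors COMPACT_MIN_DEPLETE_THRESHOLD */

/* Hypothetical stand-in for the fields this patch adds to struct zone. */
struct zone_sketch {
	unsigned long compact_success;	/* successes since the last reset */
	bool compaction_depleted;	/* models ZONE_COMPACTION_DEPLETED */
};

/* Called when a compaction attempt finds a suitable free page. */
void record_compaction_success(struct zone_sketch *zone)
{
	zone->compact_success++;
}

/*
 * Called when scanner state is reset: too few successes since the
 * previous reset mark the zone depleted, then the counter restarts.
 */
void reset_and_check_depletion(struct zone_sketch *zone)
{
	if (zone->compact_success < SKETCH_DEPLETE_THRESHOLD)
		zone->compaction_depleted = true;
	zone->compact_success = 0;
}
```

With a threshold of 1, a single success between two resets is enough to
keep the zone out of the depleted state, which is how the sketch imitates
the deferring behaviour mentioned above.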

Signed-off-by: Joonsoo Kim 
---
 include/linux/mmzone.h |  2 ++
 mm/compaction.c| 38 +++---
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 754c259..bd9f1a5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -517,6 +517,7 @@ struct zone {
unsigned intcompact_considered;
unsigned intcompact_defer_shift;
int compact_order_failed;
+   unsigned long   compact_success;
 #endif
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -543,6 +544,7 @@ enum zone_flags {
 * many pages under writeback
 */
ZONE_FAIR_DEPLETED, /* fair zone policy batch depleted */
+   ZONE_COMPACTION_DEPLETED,   /* compaction possiblity depleted */
 };
 
 static inline unsigned long zone_end_pfn(const struct zone *zone)
diff --git a/mm/compaction.c b/mm/compaction.c
index 8d1b3b5..9f259b9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -129,6 +129,23 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 
 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_DEFER_SHIFT 6
+#define COMPACT_MIN_DEPLETE_THRESHOLD 1UL
+
+static bool compaction_depleted(struct zone *zone)
+{
+   unsigned long threshold;
+   unsigned long success = zone->compact_success;
+
+   /*
+* Now, to imitate current compaction deferring approach,
+* choose threshold to 1. It will be changed in the future.
+*/
+   threshold = COMPACT_MIN_DEPLETE_THRESHOLD;
+   if (success >= threshold)
+   return false;
+
+   return true;
+}
 
 /*
  * Compaction is deferred when compaction fails to result in a page
@@ -226,6 +243,10 @@ static void __reset_isolation_suitable(struct zone *zone)
zone->compact_cached_free_pfn = end_pfn;
zone->compact_blockskip_flush = false;
 
+   if (compaction_depleted(zone))
+   set_bit(ZONE_COMPACTION_DEPLETED, &zone->flags);
+   zone->compact_success = 0;
+
/* Walk the zone and mark every pageblock as suitable for isolation */
for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
struct page *page;
@@ -1197,22 +1218,28 @@ static int __compact_finished(struct zone *zone, struct compact_control *cc,
bool can_steal;
 
/* Job done if page is free of the right migratetype */
-   if (!list_empty(&area->free_list[migratetype]))
+   if (!list_empty(&area->free_list[migratetype])) {
+   zone->compact_success++;
return COMPACT_PARTIAL;
+   }
 
 #ifdef CONFIG_CMA
/* MIGRATE_MOVABLE can fallback on MIGRATE_CMA */
if (migratetype == MIGRATE_MOVABLE &&
-   !list_empty(&area->free_list[MIGRATE_CMA]))
+   !list_empty(&area->free_list[MIGRATE_CMA])) {
+   zone->compact_success++;
return COMPACT_PARTIAL;
+   }
 #endif
/*
 * Job done if allocation would steal freepages from
 * other migratetype buddy lists.
 */
if (find_suitable_fallback(area, order, migratetype,
-   true, &can_steal) != -1)
+   true, &can_steal) != -1) {
+   zone->compact_success++;
return COMPACT_PARTIAL;
+   }
}
 
return COMPACT_NO_SUITABLE_PAGE;
@@ -1452,6 +1479,11 @@ out:
trace_mm_compaction_end(start_pfn, cc->migrate_pfn,

RE: [PATCH v2 05/28] ACPICA: Hardware: Enable firmware waking vector for both 32-bit and 64-bit FACS.

2015-06-24 Thread Zheng, Lv
Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Sent: Thursday, June 25, 2015 7:57 AM
> 
> On Wednesday, June 24, 2015 11:02:54 AM Lv Zheng wrote:
> > ACPICA commit 368eb60778b27b6ae94d3658ddc902ca1342a963
> > ACPICA commit 70f62a80d65515e1285fdeeb50d94ee6f07df4bd
> >
> > The following commit is reported to have broken s2ram on some platforms:
> >  Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd
> >  ACPICA: Add option to favor 32-bit FADT addresses.
> > The platform reports 2 FACS tables (which is not allowed by ACPI
> > specification) and the new 32-bit address favor rule forces OSPMs to use
> > the FACS table reported via FADT's X_FIRMWARE_CTRL field.
> >
> > The root cause of the reported bug might be one of the following:
> > 1. BIOS may favor the 64-bit firmware waking vector address when the
> >version of the FACS is greater than 0 and Linux currently only supports
> >resuming from the real mode, so the 64-bit firmware waking vector has
> >never been set and might be invalid to BIOS while the commit enables
> >higher version FACS.
> > 2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
> >FADT while the commit doesn't set the firmware waking vector address of
> >the FACS reported by "FIRMWARE_CTRL", it only sets the firmware waking
> >vector address of the FACS reported by "X_FIRMWARE_CTRL".
> >
> > This patch excludes the cases that can trigger the bugs caused by the root
> > cause 2.
> >
> > There is no handshaking mechanism that OSPM can use to tell the BIOS
> > which FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL"
> > may still be used by the BIOS, and the 0 value of the 32-bit firmware
> > waking vector might trigger such a failure.
> >
> > This patch enables the firmware waking vectors for both 32bit/64bit FACS
> > tables in order to ensure we can exclude the cases that trigger the bugs
> > caused by the root cause 2. The exclusion is split into 2 commits so that
> > if it turns out not to be necessary, this single commit can be reverted
> > without affecting the useful one. Lv Zheng, Bob Moore.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
> > Link: https://github.com/acpica/acpica/commit/368eb607
> > Link: https://github.com/acpica/acpica/commit/70f62a80
> > Reported-and-tested-by: Oswald Buddenhagen 
> > Signed-off-by: Lv Zheng 
> > Signed-off-by: Bob Moore 
> > ---
> >  drivers/acpi/acpica/acglobal.h  |2 ++
> >  drivers/acpi/acpica/hwxfsleep.c |   74 
> > ---
> >  drivers/acpi/acpica/tbutils.c   |   14 
> >  3 files changed, 71 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/acpi/acpica/acglobal.h b/drivers/acpi/acpica/acglobal.h
> > index a0c4787..53f96a3 100644
> > --- a/drivers/acpi/acpica/acglobal.h
> > +++ b/drivers/acpi/acpica/acglobal.h
> > @@ -61,6 +61,8 @@ ACPI_GLOBAL(struct acpi_table_header, 
> > acpi_gbl_original_dsdt_header);
> >
> >  #if (!ACPI_REDUCED_HARDWARE)
> >  ACPI_GLOBAL(struct acpi_table_facs *, acpi_gbl_FACS);
> > +ACPI_GLOBAL(struct acpi_table_facs *, acpi_gbl_facs32);
> > +ACPI_GLOBAL(struct acpi_table_facs *, acpi_gbl_facs64);
> >
> >  #endif /* !ACPI_REDUCED_HARDWARE */
> >
> > diff --git a/drivers/acpi/acpica/hwxfsleep.c 
> > b/drivers/acpi/acpica/hwxfsleep.c
> > index c67cd32..e273b2e 100644
> > --- a/drivers/acpi/acpica/hwxfsleep.c
> > +++ b/drivers/acpi/acpica/hwxfsleep.c
> > @@ -50,6 +50,13 @@
> >  ACPI_MODULE_NAME("hwxfsleep")
> >
> >  /* Local prototypes */
> > +#if (!ACPI_REDUCED_HARDWARE)
> > +static acpi_status
> > +acpi_hw_set_firmware_waking_vector(struct acpi_table_facs *facs,
> > +  acpi_physical_address physical_address,
> > +  acpi_physical_address physical_address64);
> > +#endif
> > +
> >  static acpi_status acpi_hw_sleep_dispatch(u8 sleep_state, u32 function_id);
> >
> >  /*
> > @@ -79,9 +86,10 @@ static struct acpi_sleep_functions acpi_sleep_dispatch[] 
> > = {
> >  #if (!ACPI_REDUCED_HARDWARE)
> >  
> > /***
> >   *
> > - * FUNCTION:acpi_set_firmware_waking_vector
> > + * FUNCTION:acpi_hw_set_firmware_waking_vector
> >   *
> > - * PARAMETERS:  physical_address- 32-bit physical address of ACPI real 
> > mode
> > + * PARAMETERS:  facs- Pointer to FACS table
> > + *  physical_address- 32-bit physical address of ACPI real 
> > mode
> >   *entry point
> >   *  physical_address64  - 64-bit physical address of ACPI 
> > protected
> >   *entry point
> > @@ -92,11 +100,12 @@ static struct acpi_sleep_functions 
> > acpi_sleep_dispatch[] = {
> >   *
> >   
> > **/
> >
> > -acpi_status
> > 

[RFC PATCH 10/10] mm/compaction: new threshold for compaction depleted zone

2015-06-24 Thread Joonsoo Kim
Now the compaction algorithm has become powerful: the migration scanner
traverses the whole zone range. So the old threshold for a depleted zone,
which was designed to imitate the compaction deferring approach, isn't
appropriate for the current compaction algorithm. If we keep the current
threshold of 1, we can't avoid excessive overhead caused by compaction,
because a single compaction for a low-order allocation would easily
succeed in any situation.

This patch re-implements the threshold calculation based on zone size and
the requested allocation order. We judge whether the compaction
possibility is depleted by the number of successful compactions. Roughly,
1/100 of the future scanned area should be allocated as high-order pages
during one compaction iteration; otherwise the zone's compaction
possibility is considered depleted.
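
The threshold arithmetic can be sketched in userspace C as follows. The
shifts are the ones in the diff at the end of this mail (so "1/100" is
approximated by a shift of 7, i.e. 1/128); the function names and the
sample zone size are illustrative assumptions, and order_failed is assumed
to be a small allocation order so that the shift is well defined.

```c
#include <stdbool.h>

#define SKETCH_MIN_DEPLETE_THRESHOLD 4UL

/* Number of successes needed for the zone not to count as depleted. */
unsigned long depletion_threshold(unsigned long managed_pages,
				  unsigned int order_failed)
{
	/* How many pages of the failed order the zone could hold at most. */
	unsigned long nr_possible = managed_pages >> order_failed;
	unsigned long threshold;

	nr_possible >>= 2;		/* scanner covers roughly 1/4 of the zone */
	threshold = nr_possible >> 7;	/* about 1/100 of those should succeed */

	return threshold > SKETCH_MIN_DEPLETE_THRESHOLD ?
	       threshold : SKETCH_MIN_DEPLETE_THRESHOLD;
}

bool compaction_depleted_sketch(unsigned long success,
				unsigned long managed_pages,
				unsigned int order_failed)
{
	return success < depletion_threshold(managed_pages, order_failed);
}
```

For an assumed zone of 131072 pages and a failed order of 3, the sketch
gives a threshold of 32 successes; for order 9 the computed value drops to
0 and the minimum of 4 applies.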

Below are test results with the following setup.

Memory is artificially fragmented to make order-3 allocation hard, and
most pageblocks are changed to the unmovable migratetype.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and 200 MB is
occupied by memory hogger. Most pageblocks are unmovable
migratetype.
  Fragmentation: Successful order 3 allocation candidates may be around
1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
with GFP_NORETRY. This value is determined to saturate allocation
success.

Test: hogger-frag-unmovable
                           redesign  threshold
compact_free_scanned        6441095    2235764
compact_isolated            2711081     647701
compact_migrate_scanned     4175464    1697292
compact_stall                  2059       2092
compact_success                 207        210
pgmigrate_success           1348113     318395
Success:                         44         40
Success(N):                      90         83

This change greatly decreases compaction overhead when the zone's
compaction possibility is nearly depleted. But I should admit that it's
not perfect, because the compaction success rate is decreased. More
precise tuning of the threshold would fix this regression, but it depends
highly on the workload, so I'm not doing it here.

The other test doesn't show any regression.

  System: 512 MB with 32 MB Zram
  Memory: 25% memory is allocated to make fragmentation and kernel
build is running on background. Most pageblocks are movable
migratetype.
  Fragmentation: Successful order 3 allocation candidates may be around
1500 roughly.
  Allocation attempts: Roughly 3000 order 3 allocation attempts
with GFP_NORETRY. This value is determined to saturate allocation
success.

Test: build-frag-movable
                           redesign  threshold
compact_free_scanned        2359553    1461131
compact_isolated             907515     387373
compact_migrate_scanned     3785605    2177090
compact_stall                  2195       2157
compact_success                 247        225
pgmigrate_success            439739     182366
Success:                         43         43
Success(N):                      89         90

Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 99f533f..63702b3 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -129,19 +129,24 @@ static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 
 /* Do not skip compaction more than 64 times */
 #define COMPACT_MAX_FAILED 4
-#define COMPACT_MIN_DEPLETE_THRESHOLD 1UL
+#define COMPACT_MIN_DEPLETE_THRESHOLD 4UL
 #define COMPACT_MIN_SCAN_LIMIT (pageblock_nr_pages)
 
 static bool compaction_depleted(struct zone *zone)
 {
-   unsigned long threshold;
+   unsigned long nr_possible;
unsigned long success = zone->compact_success;
+   unsigned long threshold;
 
-   /*
-* Now, to imitate current compaction deferring approach,
-* choose threshold to 1. It will be changed in the future.
-*/
-   threshold = COMPACT_MIN_DEPLETE_THRESHOLD;
+   nr_possible = zone->managed_pages >> zone->compact_order_failed;
+
+   /* Migration scanner can scans more than 1/4 range of zone */
+   nr_possible >>= 2;
+
+   /* We hope to succeed more than 1/100 roughly */
+   threshold = nr_possible >> 7;
+
+   threshold = max(threshold, COMPACT_MIN_DEPLETE_THRESHOLD);
if (success >= threshold)
return false;
 
-- 
1.9.1



[RFC PATCH 08/10] mm/compaction: remove compaction deferring

2015-06-24 Thread Joonsoo Kim
Now we have a way to determine the compaction depleted state, and
compaction activity is limited according to this state and the depletion
depth, so compaction overhead is well controlled without compaction
deferring. So this patch removes compaction deferring completely.

Various functions are renamed and tracepoint outputs are changed due to
this removal.

Signed-off-by: Joonsoo Kim 
---
 include/linux/compaction.h| 14 +---
 include/linux/mmzone.h|  3 +-
 include/trace/events/compaction.h | 30 +++-
 mm/compaction.c   | 74 ++-
 mm/page_alloc.c   |  2 +-
 mm/vmscan.c   |  4 +--
 6 files changed, 37 insertions(+), 90 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index aa8f61c..8d98f3c 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -45,11 +45,8 @@ extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern unsigned long compaction_suitable(struct zone *zone, int order,
int alloc_flags, int classzone_idx);
 
-extern void defer_compaction(struct zone *zone, int order);
-extern bool compaction_deferred(struct zone *zone, int order);
-extern void compaction_defer_reset(struct zone *zone, int order,
+extern void compaction_failed_reset(struct zone *zone, int order,
bool alloc_success);
-extern bool compaction_restarting(struct zone *zone, int order);
 
 #else
 static inline unsigned long try_to_compact_pages(gfp_t gfp_mask,
@@ -74,15 +71,6 @@ static inline unsigned long compaction_suitable(struct zone *zone, int order,
return COMPACT_SKIPPED;
 }
 
-static inline void defer_compaction(struct zone *zone, int order)
-{
-}
-
-static inline bool compaction_deferred(struct zone *zone, int order)
-{
-   return true;
-}
-
 #endif /* CONFIG_COMPACTION */
 
 #if defined(CONFIG_COMPACTION) && defined(CONFIG_SYSFS) && defined(CONFIG_NUMA)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 700e9b5..e13b732 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -514,8 +514,7 @@ struct zone {
 * are skipped before trying again. The number attempted since
 * last failure is tracked with compact_considered.
 */
-   unsigned intcompact_considered;
-   unsigned intcompact_defer_shift;
+   int compact_failed;
int compact_order_failed;
unsigned long   compact_success;
unsigned long   compact_depletion_depth;
diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 9a6a3fe..323e614 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -239,7 +239,7 @@ DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
 );
 
 #ifdef CONFIG_COMPACTION
-DECLARE_EVENT_CLASS(mm_compaction_defer_template,
+DECLARE_EVENT_CLASS(mm_compaction_deplete_template,
 
TP_PROTO(struct zone *zone, int order),
 
@@ -249,8 +249,9 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
__field(int, nid)
__field(char *, name)
__field(int, order)
-   __field(unsigned int, considered)
-   __field(unsigned int, defer_shift)
+   __field(unsigned long, success)
+   __field(unsigned long, depletion_depth)
+   __field(int, failed)
__field(int, order_failed)
),
 
@@ -258,35 +259,30 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
__entry->nid = zone_to_nid(zone);
__entry->name = (char *)zone->name;
__entry->order = order;
-   __entry->considered = zone->compact_considered;
-   __entry->defer_shift = zone->compact_defer_shift;
+   __entry->success = zone->compact_success;
+   __entry->depletion_depth = zone->compact_depletion_depth;
+   __entry->failed = zone->compact_failed;
__entry->order_failed = zone->compact_order_failed;
),
 
-   TP_printk("node=%d zone=%-8s order=%d order_failed=%d consider=%u limit=%lu",
+   TP_printk("node=%d zone=%-8s order=%d failed=%d order_failed=%d consider=%lu depth=%lu",
__entry->nid,
__entry->name,
__entry->order,
+   __entry->failed,
__entry->order_failed,
-   __entry->considered,
-   1UL << __entry->defer_shift)
+   __entry->success,
+   __entry->depletion_depth)
 );
 
-DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_deferred,
+DEFINE_EVENT(mm_compaction_deplete_template, mm_compaction_fail_compaction,
 
TP_PROTO(struct zone *zone, int order),
 
TP_ARGS(zone, order)
 );
 

[RFC PATCH 04/10] mm/compaction: clean-up restarting condition check

2015-06-24 Thread Joonsoo Kim
Rename check function and move one outer condition check to this function.
There is no functional change.

Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 2d8e211..dd2063b 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -188,8 +188,11 @@ void compaction_defer_reset(struct zone *zone, int order,
 }
 
 /* Returns true if restarting compaction after many failures */
-bool compaction_restarting(struct zone *zone, int order)
+static bool compaction_direct_restarting(struct zone *zone, int order)
 {
+   if (current_is_kswapd())
+   return false;
+
if (order < zone->compact_order_failed)
return false;
 
@@ -1327,7 +1330,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 * is about to be retried after being deferred. kswapd does not do
 * this reset as it'll reset the cached information when going to sleep.
 */
-   if (compaction_restarting(zone, cc->order) && !current_is_kswapd())
+   if (compaction_direct_restarting(zone, cc->order))
__reset_isolation_suitable(zone);
 
/*
-- 
1.9.1



[RFC PATCH 01/10] mm/compaction: update skip-bit if whole pageblock is really scanned

2015-06-24 Thread Joonsoo Kim
Scanning of a pageblock stops in the middle of the pageblock if enough
pages are isolated. In the next run, scanning begins again at this
position, and if it finds no isolation candidate between the middle and
the end of the pageblock, it updates the skip-bit. In this case, the
scanner didn't start at the beginning of the pageblock, so it is not
appropriate to set the skip-bit. This patch fixes this so that the
skip-bit is updated only when the whole pageblock has really been scanned.
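
The condition this patch adds to update_pageblock_skip() can be modelled
in userspace C as below. This is only a sketch: the pageblock size is an
assumed power-of-two value standing in for pageblock_nr_pages, and the
helper names are made up for the example.

```c
#include <stdbool.h>

#define SKETCH_PAGEBLOCK_NR_PAGES 512UL	/* assumed; must be a power of two */

static unsigned long round_down_ul(unsigned long x, unsigned long align)
{
	return x & ~(align - 1);
}

/*
 * True only if the scan ran to the end of the pageblock (curr_pfn ==
 * end_pfn) and also started at the first pfn of that same pageblock.
 */
bool whole_pageblock_scanned(unsigned long start_pfn, unsigned long end_pfn,
			     unsigned long curr_pfn)
{
	if (curr_pfn != end_pfn)
		return false;

	return start_pfn == round_down_ul(end_pfn - 1,
					  SKETCH_PAGEBLOCK_NR_PAGES);
}
```

With these assumed sizes, a scan of pfns 512 to 1024 that reaches 1024
counts as a whole-pageblock scan, while a scan resumed at pfn 700 does not,
even though it also reaches the end of the block.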

Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 32 ++--
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 6ef2fdf..4397bf7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -261,7 +261,8 @@ void reset_isolation_suitable(pg_data_t *pgdat)
  */
 static void update_pageblock_skip(struct compact_control *cc,
struct page *page, unsigned long nr_isolated,
-   bool migrate_scanner)
+   unsigned long start_pfn, unsigned long end_pfn,
+   unsigned long curr_pfn, bool migrate_scanner)
 {
struct zone *zone = cc->zone;
unsigned long pfn;
@@ -275,6 +276,13 @@ static void update_pageblock_skip(struct compact_control *cc,
if (nr_isolated)
return;
 
+   /* Update the pageblock-skip if the whole pageblock was scanned */
+   if (curr_pfn != end_pfn)
+   return;
+
+   if (start_pfn != round_down(end_pfn - 1, pageblock_nr_pages))
+   return;
+
set_pageblock_skip(page);
 
pfn = page_to_pfn(page);
@@ -300,7 +308,8 @@ static inline bool isolation_suitable(struct compact_control *cc,
 
 static void update_pageblock_skip(struct compact_control *cc,
struct page *page, unsigned long nr_isolated,
-   bool migrate_scanner)
+   unsigned long start_pfn, unsigned long end_pfn,
+   unsigned long curr_pfn, bool migrate_scanner)
 {
 }
 #endif /* CONFIG_COMPACTION */
@@ -493,9 +502,6 @@ isolate_fail:
trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn,
nr_scanned, total_isolated);
 
-   /* Record how far we have got within the block */
-   *start_pfn = blockpfn;
-
/*
 * If strict isolation is requested by CMA then check that all the
 * pages requested were isolated. If there were any failures, 0 is
@@ -507,9 +513,11 @@ isolate_fail:
if (locked)
spin_unlock_irqrestore(&cc->zone->lock, flags);
 
-   /* Update the pageblock-skip if the whole pageblock was scanned */
-   if (blockpfn == end_pfn)
-   update_pageblock_skip(cc, valid_page, total_isolated, false);
+   update_pageblock_skip(cc, valid_page, total_isolated,
+   *start_pfn, end_pfn, blockpfn, false);
+
+   /* Record how far we have got within the block */
+   *start_pfn = blockpfn;
 
count_compact_events(COMPACTFREE_SCANNED, nr_scanned);
if (total_isolated)
@@ -806,12 +814,8 @@ isolate_success:
if (locked)
spin_unlock_irqrestore(&zone->lru_lock, flags);
 
-   /*
-* Update the pageblock-skip information and cached scanner pfn,
-* if the whole pageblock was scanned without isolating any page.
-*/
-   if (low_pfn == end_pfn)
-   update_pageblock_skip(cc, valid_page, nr_isolated, true);
+   update_pageblock_skip(cc, valid_page, nr_isolated,
+   start_pfn, end_pfn, low_pfn, true);
 
trace_mm_compaction_isolate_migratepages(start_pfn, low_pfn,
nr_scanned, nr_isolated);
-- 
1.9.1



[RFC PATCH 02/10] mm/compaction: skip useless pfn for scanner's cached pfn

2015-06-24 Thread Joonsoo Kim
The scanner's cached pfn is used to determine the start position of the
scanner at the next compaction run. Currently the cached pfn points to the
skipped pageblock, so we uselessly check whether the pageblock is valid
for compaction and whether its skip-bit is set. If we set the scanner's
cached pfn to the next pfn after the skipped pageblock, we don't need to
do this check.
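
A userspace C sketch of the cached-pfn update after this change is shown
below. The struct is a hypothetical stand-in for the three cached fields
in struct zone, and the sync flag replaces the cc->mode != MIGRATE_ASYNC
test of the real code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the cached scanner positions in struct zone. */
struct cached_pfn_sketch {
	unsigned long migrate_pfn[2];	/* [0] async, [1] sync */
	unsigned long free_pfn;
};

/*
 * After a pageblock [start_pfn, end_pfn) has been marked to skip, move
 * the cached position past it: forward for the migration scanner, which
 * scans upward, and down to start_pfn for the free scanner, which scans
 * downward.
 */
void update_cached_pfn(struct cached_pfn_sketch *zone,
		       unsigned long start_pfn, unsigned long end_pfn,
		       bool migrate_scanner, bool sync)
{
	if (migrate_scanner) {
		if (end_pfn > zone->migrate_pfn[0])
			zone->migrate_pfn[0] = end_pfn;
		if (sync && end_pfn > zone->migrate_pfn[1])
			zone->migrate_pfn[1] = end_pfn;
	} else {
		if (start_pfn < zone->free_pfn)
			zone->free_pfn = start_pfn;
	}
}
```

In this model an async migration scan that skips the block 512 to 1024
advances only the async position to 1024, so the next async run starts at
the following pageblock instead of re-checking the skipped one.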

Signed-off-by: Joonsoo Kim 
---
 mm/compaction.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 4397bf7..9c5d43c 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -265,7 +265,6 @@ static void update_pageblock_skip(struct compact_control *cc,
unsigned long curr_pfn, bool migrate_scanner)
 {
struct zone *zone = cc->zone;
-   unsigned long pfn;
 
if (cc->ignore_skip_hint)
return;
@@ -285,18 +284,16 @@ static void update_pageblock_skip(struct compact_control *cc,
 
set_pageblock_skip(page);
 
-   pfn = page_to_pfn(page);
-
/* Update where async and sync compaction should restart */
if (migrate_scanner) {
-   if (pfn > zone->compact_cached_migrate_pfn[0])
-   zone->compact_cached_migrate_pfn[0] = pfn;
+   if (end_pfn > zone->compact_cached_migrate_pfn[0])
+   zone->compact_cached_migrate_pfn[0] = end_pfn;
if (cc->mode != MIGRATE_ASYNC &&
-   pfn > zone->compact_cached_migrate_pfn[1])
-   zone->compact_cached_migrate_pfn[1] = pfn;
+   end_pfn > zone->compact_cached_migrate_pfn[1])
+   zone->compact_cached_migrate_pfn[1] = end_pfn;
} else {
-   if (pfn < zone->compact_cached_free_pfn)
-   zone->compact_cached_free_pfn = pfn;
+   if (start_pfn < zone->compact_cached_free_pfn)
+   zone->compact_cached_free_pfn = start_pfn;
}
 }
 #else
-- 
1.9.1



[PULL] Documentation for 4.2

2015-06-24 Thread Jonathan Corbet
The following changes since commit
d4a4f75cd8f29cd9464a5a32e9224a91571d6649:

  Linux 4.1-rc7 (2015-06-07 20:23:50 -0700)

are available in the git repository at:

  git://git.lwn.net/linux-2.6.git tags/docs-for-linus

for you to fetch changes up to 36f95a0b34cb980dcfff9c1082ca5d8f0dc5e78b:

  doc:md: fix typo in md.txt. (2015-06-23 06:49:44 -0600)


Documentation updates for 4.2

The main thing here is Ingo's big subdirectory documenting feature support
for each architecture.  Beyond that, it's the usual pile of fixes, tweaks,
and small additions.


Alexander Kuleshov (1):
  Documentation/kernel-parameters: add missing pciserial to the earlyprintk

Andreas Gruenbacher (1):
  vfs: Minor documentation fix

Anish Bhatt (1):
  kbuild : Fix documentation of INSTALL_HDR_PATH

Baruch Siach (1):
  Documentation/CodingStyle: fix example macro parenthesis imbalance

Ben Hutchings (1):
  firmware: Update information in linux.git about adding firmware

Chen Gang (1):
  Docs: blackfin: Use new switch macro SAMPLE_IRQ_TIMER instead of 
IRQ_TIMER5

Chen Hanxiao (2):
  Docs: proc: fix kernel version
  docs: add VmPMD description in proc

Christoffer Dall (1):
  stable: Update documentation to clarify preferred procedure

Frans Klaver (1):
  Doc: networking: txtimestamp: fix printf format warning

Geert Uytterhoeven (4):
  Documentation/magic-number: Remove SCI_MAGIC
  Documentation/magic-number: Remove SCC_MAGIC
  DMA-API: Spelling s/This/Think/
  gpiolib: Grammar s/an negative/a negative/

H. Nikolaus Schaller (1):
  Documentation usb serial: fixed how to provide vendor and product id

Ingo Molnar (44):
  Documentation/features/vm: Add feature description and arch support 
status file for 'numa-memblock'
  Documentation/features/vm: Add feature description and arch support 
status file for 'PG_uncached'
  Documentation/features/lib: Add feature description and arch support 
status file for 'strncasecmp'
  Documentation/features/io: Add feature description and arch support 
status file for 'sg-chain'
  Documentation/features/vm: Add feature description and arch support 
status file for 'huge-vmap'
  Documentation/features/vm: Add feature description and arch support 
status file for 'pte_special'
  Documentation/features/vm: Add feature description and arch support 
status file for 'pmdp_splitting_flush'
  Documentation/features/debug: Add feature description and arch support 
status file for 'KASAN'
  Documentation/features/time: Add feature description and arch support 
status file for 'modern-timekeeping'
  Documentation/features/time: Add feature description and arch support 
status file for 'virt-cpuacct'
  Documentation/features/time: Add feature description and arch support 
status file for 'irq-time-acct'
  Documentation/features/vm: Add feature description and arch support 
status file for 'THP'
  Documentation/features/locking: Add feature description and arch support 
status file for 'rwsem-optimized'
  Documentation/features/sched: Add feature description and arch support 
status file for 'numa-balancing'
  Documentation/features/io: Add feature description and arch support 
status file for 'dma-contiguous'
  Documentation/features/io: Add feature description and arch support 
status file for 'dma_map_attrs'
  Documentation/features/core: Add feature description and arch support 
status file for 'tracehook'
  Documentation/features/vm: Add feature description and arch support 
status file for 'ioremap_prot'
  Documentation/features/locking: Add feature description and arch support 
status file for 'lockdep'
  Documentation/features/debug: Add feature description and arch support 
status file for 'stackprotector'
  Documentation/features/core: Add feature description and arch support 
status file for 'jump-labels'
  Documentation/features/seccomp: Add feature description and arch support 
status file for 'seccomp-filter'
  Documentation/features/time: Add feature description and arch support 
status file for 'context-tracking'
  Documentation/features/debug: Add feature description and arch support 
status file for 'kgdb'
  Documentation/features/time: Add feature description and arch support 
status file for 'clockevents'
  Documentation/features/vm: Add feature description and arch support 
status file for 'ELF-ASLR'
  Documentation/features/time: Add feature description and arch support 
status file for 'arch-tick-broadcast'
  Documentation/features/debug: Add feature description and arch support 
status file for 'kprobes'
  Documentation/features/debug: Add feature description and arch support 
status file for 'optprobes'
  Documentation/features/debug: Add feature description and arch support 
status 

Re: [RFC PATCH] Fix: x86 unaligned __memcpy to/from virtual memory

2015-06-24 Thread Linus Torvalds
On Wed, Jun 24, 2015 at 4:54 PM, Mathieu Desnoyers
 wrote:
>
> OK, see below. This time the fault occurred at an unaligned address.
> It fails on the !pte_present(*pte_ref) check.

So every time, %rcx is 0x001fb.

Once, your rdx value (which is remaining bytes after the movsq) was 3,
the other two times it's 0.

What's so magical about that 4056-byte copy (+3 bytes once)? Are you
*sure* that copy is valid?
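
For reference, a quick sketch of where the 4056 figure comes from, assuming %rcx holds the number of 8-byte words still to be moved by the rep movsq and %rdx the trailing bytes. The helper name is made up for illustration and is not kernel code:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: total length of a copy that is
 * split into 'qwords' 8-byte moves (rep movsq) plus 'tail' leftover bytes.
 */
static uint64_t copy_len_bytes(uint64_t qwords, uint64_t tail)
{
	return qwords * 8 + tail;
}
```

With these assumptions, copy_len_bytes(0x1fb, 0) is 4056 and copy_len_bytes(0x1fb, 3) is 4059, which matches the two cases seen in the traces.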

   Linus


Re: [PATCH v7 00/14] crypto: add a new driver for Marvell's CESA

2015-06-24 Thread Javier Martinez Canillas
Hello Paul,

On Thu, Jun 25, 2015 at 2:00 AM, Paul Gortmaker
 wrote:
> [Re: [PATCH v7 00/14] crypto: add a new driver for Marvell's CESA] On 
> 22/06/2015 (Mon 15:59) Herbert Xu wrote:
>
>> On Mon, Jun 22, 2015 at 09:23:36AM +0200, Boris Brezillon wrote:
>> > Hi Herbert,
>> >
>> > On Sun, 21 Jun 2015 16:27:17 +0800
>> > Herbert Xu  wrote:
>> >
>> > > On Sun, Jun 21, 2015 at 10:24:18AM +0200, Boris Brezillon wrote:
>> > > >
>> > > > Indeed. Here is a patch fixing that.
>> > >
>> > > I think you should just kill COMPILE_TEST instead of adding ARM.
>> >
>> > The following patch is killing the COMPILE_TEST dependency.
>>
>> Patch applied.
>
> Just a heads up, this driver is still killing a couple of linux-next
> builds today and for the past few days.
>
> drivers/crypto/mv_cesa.c:1037:2: error: implicit declaration of function
> 'of_get_named_gen_pool' [-Werror=implicit-function-declaration]
>
> http://kisskb.ellerman.id.au/kisskb/buildresult/12448851/
> http://kisskb.ellerman.id.au/kisskb/buildresult/12448776/
>
> Missing dependency on CONFIG_OF_ presumably.
>

I haven't looked at the series, but <linux/genalloc.h> has a stub
of_get_named_gen_pool() function if CONFIG_OF is not enabled [0].

So it seems that the problem is rather that the header is not being
included in some file.
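
A minimal sketch of that stub pattern, as a standalone reduction rather than the actual header: the types are left opaque and the parameter list is an assumption about how the declaration looks. A file that never includes the header sees neither the prototype nor the stub, which would give exactly the implicit-declaration error quoted above.

```c
#include <stddef.h>

/* Opaque stand-ins for the kernel types; only pointers are needed here. */
struct gen_pool;
struct device_node;

#ifdef CONFIG_OF
/* With OF support built in, the real implementation is linked in. */
struct gen_pool *of_get_named_gen_pool(struct device_node *np,
				       const char *propname, int index);
#else
/* Stub: callers still compile and simply see "no pool" at run time. */
static inline struct gen_pool *of_get_named_gen_pool(struct device_node *np,
						     const char *propname,
						     int index)
{
	(void)np;
	(void)propname;
	(void)index;
	return NULL;
}
#endif
```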

> Paul.
> --
>

Best regards,
Javier

[0]: http://lxr.free-electrons.com/source/include/linux/genalloc.h#L131


Re: [RFC PATCH] Fix: x86 unaligned __memcpy to/from virtual memory

2015-06-24 Thread Mathieu Desnoyers
- On Jun 24, 2015, at 7:54 PM, Mathieu Desnoyers 
mathieu.desnoy...@efficios.com wrote:
> - On Jun 24, 2015, at 3:15 PM, Linus Torvalds 
> torva...@linux-foundation.org
> wrote:
> 
>> On Wed, Jun 24, 2015 at 11:49 AM, Mathieu Desnoyers
>>  wrote:
>>>
>>> Here is the output. I added the printk just after the initial range
>>> check within vmalloc_fault.
>> 
>> Good. Can you add printk's to the error return paths too, so that we
>> see which one it is that triggers.
> 
> OK, see below. This time the fault occurred at an unaligned address.
> It fails on the !pte_present(*pte_ref) check.

I just tried to do a bytewise copy in C rather than call
memcpy, and I got the fault to trigger. So I guess I was on
the wrong track assuming __memcpy would be the culprit.
What is odd is that if I issue vmalloc_sync_all() after each
vmalloc call, the OOPS never triggers. It is clearly a test
case that ends up stressing vfree/vmalloc.

[   34.751984] DEBUG: vmalloc_fault at address 0xc9000729
[   34.753188] DEBUG: !pte_present(*pte_ref) error
[   34.753188] BUG: unable to handle kernel paging request at c9000729
[   34.753188] IP: [] lttng_event_write+0x90/0xd0 
[lttng_ring_buffer_metadata_client]
[   34.753188] PGD 236c92067 PUD 236c93067 PMD b6964067 PTE 0
[   34.753188] Oops:  [#1] SMP 
[   34.753188] Modules linked in: lttng_probe_workqueue(O) 
lttng_probe_vmscan(O) lttng_probe_udp(O) lttng_probe_timer(O) 
lttng_probe_sunrpc(O) lttng_probe_statedump(O) lttng_probe_sock(O) 
lttng_probe_skb(O) lttng_probe_signal(O) lttng_probe_scsi(O) 
lttng_probe_sched(O) lttng_probe_regmap(O) lttng_probe_rcu(O) 
lttng_probe_random(O) lttng_probe_power(O) lttng_probe_net(O) 
lttng_probe_napi(O) lttng_probe_module(O) lttng_probe_kmem(O) 
lttng_probe_jbd2(O) lttng_probe_irq(O) lttng_probe_ext4(O) 
lttng_probe_compaction(O) lttng_probe_block(O) lttng_types(O) 
lttng_ring_buffer_metadata_mmap_client(O) 
lttng_ring_buffer_client_mmap_overwrite(O) 
lttng_ring_buffer_client_mmap_discard(O) lttng_ring_buffer_metadata_client(O) 
lttng_ring_buffer_client_overwrite(O) lttng_ring_buffer_client_discard(O) 
lttng_tracer(O) lttng_statedump(O) lttng_kprobes(O) lttng_lib_ring_buffer(O) 
lttng_kretprobes(O) virtio_blk virtio_net virtio_pci virtio_ring virtio [last 
unloaded: lttng_statedump]
[   34.753188] CPU: 26 PID: 3563 Comm: lttng-consumerd Tainted: G   O   
 4.1.0+ #11
[   34.753188] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
[   34.753188] task: 880234d94880 ti: 88022af6c000 task.ti: 
88022af6c000
[   34.753188] RIP: 0010:[]  [] 
lttng_event_write+0x90/0xd0 [lttng_ring_buffer_metadata_client]
[   34.753188] RSP: 0018:88022af6fda8  EFLAGS: 00010212
[   34.753188] RAX: 009d RBX: 0fd8 RCX: 0025
[   34.753188] RDX: 8800b7681120 RSI: c9000728ff63 RDI: 
[   34.753188] RBP: 88022af6fdb8 R08: 009d R09: 88022ea33025
[   34.753188] R10: 003b R11: 0246 R12: 88022af6fdc8
[   34.753188] R13: 880231565c00 R14: 0fd8 R15: 0fd8
[   34.753188] FS:  7fd64b5f2700() GS:88023754() 
knlGS:
[   34.753188] CS:  0010 DS:  ES:  CR0: 8005003b
[   34.753188] CR2: c9000729 CR3: 000233803000 CR4: 06e0
[   34.753188] Stack:
[   34.753188]  880234cbff00 880234cbff50 88022af6fe48 
a048e060
[   34.753188]  880231565c00  0fd8 
0001
[   34.753188]  88023155d000 0fd8 4025 
4025
[   34.753188] Call Trace:
[   34.753188]  [] lttng_metadata_output_channel+0xd0/0x120 
[lttng_tracer]
[   34.753188]  [] lttng_metadata_ring_buffer_ioctl+0x79/0xd0 
[lttng_tracer]
[   34.753188]  [] do_vfs_ioctl+0x2e0/0x4e0
[   34.753188]  [] ? file_has_perm+0x87/0xa0
[   34.753188]  [] SyS_ioctl+0x81/0xa0
[   34.753188]  [] ? syscall_trace_leave+0xd1/0xe0
[   34.753188]  [] tracesys_phase2+0x84/0x89
[   34.753188] Code: d9 48 0f 47 cb 48 39 cb 75 46 48 8d 57 02 25 ff 0f 00 00 
45 31 c0 48 89 c1 31 c0 48 c1 e2 04 4c 01 ca 66 0f 1f 84 00 00 00 00 00 <44> 0f 
b6 14 06 49 89 c9 4c 03 0a 41 83 c0 01 45 88 14 01 49 63 
[   34.753188] RIP  [] lttng_event_write+0x90/0xd0 
[lttng_ring_buffer_metadata_client]
[   34.753188]  RSP 
[   34.753188] CR2: c9000729
[   34.753188] ---[ end trace 28951381246c3a2e ]---


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


RE: [PATCH v2 03/28] ACPICA: Hardware: Enable 64-bit firmware waking vector for selected FACS.

2015-06-24 Thread Zheng, Lv
Hi, Rafael

> From: Rafael J. Wysocki [mailto:r...@rjwysocki.net]
> Sent: Thursday, June 25, 2015 7:24 AM
> To: Zheng, Lv
> 
> On Wednesday, June 24, 2015 04:05:42 PM Rafael J. Wysocki wrote:
> > On Wednesday, June 24, 2015 11:02:10 AM Lv Zheng wrote:
> > > ACPICA commit 7aa598d711644ab0de5f70ad88f1e2de253115e4
> > >
> > > The following commit is reported to have broken s2ram on some platforms:
> > >  Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd
> > >  ACPICA: Add option to favor 32-bit FADT addresses.
> > > The platform reports 2 FACS tables (which is not allowed by ACPI
> > > specification) and the new 32-bit address favor rule forces OSPMs to use
> > > the FACS table reported via FADT's X_FIRMWARE_CTRL field.
> > >
> > > The root cause of the reported bug might be one of the following:
> > > 1. BIOS may favor the 64-bit firmware waking vector address when the
> > >version of the FACS is greater than 0 and Linux currently only supports
> > >resuming from the real mode, so the 64-bit firmware waking vector has
> > >never been set and might be invalid to BIOS while the commit enables
> > >higher version FACS.
> > > 2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
> > >FADT while the commit doesn't set the firmware waking vector address of
> > > >the FACS reported by "FIRMWARE_CTRL", it only sets the firmware waking
> > >vector address of the FACS reported by "X_FIRMWARE_CTRL".
> > >
> > > This patch excludes the cases that can trigger the bugs caused by the root
> > > cause 1.
> > >
> > > ACPI specification says:
> > > A. 32-bit FACS address (FIRMWARE_CTRL field in FADT):
> > >Physical memory address of the FACS, where OSPM and firmware exchange
> > >control information.
> > >If the X_FIRMWARE_CTRL field contains a non zero value then this field
> > >must be zero.
> > >A zero value indicates that no FACS is specified by this field.
> > > B. 64-bit FACS address (X_FIRMWARE_CTRL field in FADT):
> > >64bit physical memory address of the FACS.
> > >This field is used when the physical address of the FACS is above 4GB.
> > >If the FIRMWARE_CTRL field contains a non zero value then this field
> > >must be zero.
> > >A zero value indicates that no FACS is specified by this field.
> > > Thus the 32bit and 64bit firmware waking vector should indicate completely
> > > different resuming environment - real mode (1MB addressable) and non real
> > > mode (4GB+ addressable) and currently Linux only supports resuming from
> > > real mode.
> > >
> > > This patch enables 64-bit firmware waking vector for selected FACS via
> > > acpi_set_firmware_waking_vector() so that it's up to OSPMs to determine 
> > > which
> > > resuming mode should be used by BIOS and ACPICA changes won't trigger the
> > > bugs caused by the root cause 1. For example, Linux can pass
> > > physical_address64=0 as the parameter of 
> > > acpi_set_firmware_waking_vector() to
> > > indicate no 64bit waking vector support. Lv Zheng.
> > >
> > > This patch also updates acpi_set_firmware_waking_vector() invocations in
> > > order to keep 32-bit firmware waking vector favor for Linux. 64-bit
> > > firmware waking vector has never been enabled by Linux.  The
> > > (acpi_physical_address)0 for 64-bit address can be used to force ACPICA to
> > > set only 32-bit firmware waking vector for Linux.
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021
> > > Link: https://github.com/acpica/acpica/commit/7aa598d7
> > > Cc: 3.14.1+  # 3.14.1+
> > > Reported-and-tested-by: Oswald Buddenhagen 
> > > Signed-off-by: Lv Zheng 
> > > Signed-off-by: Bob Moore 
> > > Cc: Thomas Gleixner 
> > > Cc: Ingo Molnar 
> > > Cc: "H. Peter Anvin" 
> > > Cc: x...@kernel.org
> > > Cc: Tony Luck 
> > > Cc: Fenghua Yu 
> > > Cc: linux-i...@vger.kernel.org
> > > ---
> > >  arch/ia64/include/asm/acpi.h|3 +-
> > >  arch/ia64/kernel/acpi.c |2 --
> > >  arch/x86/include/asm/acpi.h |3 +-
> > >  drivers/acpi/acpica/hwxfsleep.c |   61 
> > > ---
> > >  drivers/acpi/sleep.c|8 +++--
> > >  include/acpi/acpixf.h   |   11 +++
> > >  6 files changed, 33 insertions(+), 55 deletions(-)
> > >
> > > diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
> > > index aa0fdf1..0ac4fab 100644
> > > --- a/arch/ia64/include/asm/acpi.h
> > > +++ b/arch/ia64/include/asm/acpi.h
> > > @@ -79,7 +79,8 @@ int acpi_gsi_to_irq (u32 gsi, unsigned int *irq);
> > >  /* Low-level suspend routine. */
> > >  extern int acpi_suspend_lowlevel(void);
> > >
> > > -extern unsigned long acpi_wakeup_address;
> > > +#define acpi_wakeup_address  ((acpi_physical_address)0)
> > > +#define acpi_wakeup_address64((acpi_physical_address)0)
> > >
> > >  /*
> > >   * Record the cpei override flag and current logical cpu. This is
> > > diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
> > > index 

RE: [PATCH v2 02/28] ACPICA: Linuxize: Replace __FUNCTION__ with __func__.

2015-06-24 Thread Zheng, Lv
Hi,

> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Wednesday, June 24, 2015 8:56 PM
> 
> On Wed, Jun 24, 2015 at 11:02:03AM +0800, Lv Zheng wrote:
> > ACPICA commit cb3d1c79f862cd368d749c9b8d9dced40111b0d0
> >
> > __FUNCTION__ is MSVC only, in Linux, it is __func__. Lv Zheng.
> >
> > In ACPICA, this is achieved by string replacement in release script and
> > this patch contains the source code difference between the Linux upstream
> > and ACPICA that is caused by the back porting.
> 
> __func__ is in C99 and newer.  __FUNCTION__ is an old extension supported
> by various compilers.

This patch description is used in ACPICA upstream.
For ACPICA code base, __FUNCTION__ is only used for its MSVC builds.
And __func__ is converted from __FUNCTION__ by the linuxize release script.

See the original commit here:
https://github.com/acpica/acpica/commit/cb3d1c79

So this is simply an automated release output.
Without this merged, source code differences between Linux upstream and ACPICA 
upstream will hurt the automation.
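
As a small illustration of the identifier the release script emits: __func__ is the predefined identifier from C99, so code using it does not depend on a compiler-specific spelling. The function below is hypothetical and exists only to show the behaviour:

```c
/*
 * Hypothetical function, not ACPICA code: returns its own name through
 * the C99 predefined identifier __func__.
 */
static const char *example_trace_name(void)
{
	return __func__;
}
```

Calling example_trace_name() yields the string "example_trace_name", which is the kind of value a debug or trace macro would print.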

Thanks and best regards
-Lv


[PATCH 0/3] mfd: ChromeOS EC Kconfig dependency cleanup

2015-06-24 Thread Javier Martinez Canillas

Hello,

This is a trivial series that makes some changes to the dependencies of the
ChromeOS EC drivers' Kconfig symbols. The patches are on top of Paul's
patch "mfd: fix dependency warning for CHROME_PLATFORMS on !X86, !ARM":
https://lkml.org/lkml/2015/6/20/219.

Paul fixed a warning about unmet dependencies, but I think the correct fix
is to remove the unneeded dependencies. That is what this series does; it is
composed of the following patches:


Javier Martinez Canillas (3):
  platform/chrome: Don't make CHROME_PLATFORMS depends on X86 || ARM
  mfd: Remove MFD_CROS_EC depends on X86 || ARM
  mfd: Remove MFD_CROS_EC_SPI depends on OF

 drivers/mfd/Kconfig | 3 +--
 drivers/platform/chrome/Kconfig | 1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

-- 
2.1.4



[PATCH 3/3] mfd: Remove MFD_CROS_EC_SPI depends on OF

2015-06-24 Thread Javier Martinez Canillas
The ChromeOS EC SPI transport driver has a dependency on OF because it
uses some OF helpers from the  header. But there isn't a
need for an explicit dependency since the header has stub functions if
CONFIG_OF is not defined.

Also, MFD_CROS_EC_SPI already depends on MFD_CROS_EC, which in turn has
a dependency on OF, so in practice it can't be selected without CONFIG_OF.

Signed-off-by: Javier Martinez Canillas 

---

 drivers/mfd/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 653815950aa2..3f68dd251ce8 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -115,7 +115,7 @@ config MFD_CROS_EC_I2C
 
 config MFD_CROS_EC_SPI
tristate "ChromeOS Embedded Controller (SPI)"
-   depends on MFD_CROS_EC && CROS_EC_PROTO && SPI && OF
+   depends on MFD_CROS_EC && CROS_EC_PROTO && SPI
 
---help---
  If you say Y here, you get support for talking to the ChromeOS EC
-- 
2.1.4



[PATCH 1/3] platform/chrome: Don't make CHROME_PLATFORMS depends on X86 || ARM

2015-06-24 Thread Javier Martinez Canillas
The Chrome platform support depends on X86 || ARM because there are
only Chromebooks using those architectures. But only some drivers
depend on a given architecture, and the ones that do already have
a dependency on their specific Kconfig symbol entries.

An option is to also make CHROME_PLATFORMS depend on X86 || ARM || COMPILE_TEST,
but it is more future-proof to remove the dependency and let the drivers
be built on all architectures, if possible, to get more build coverage.

Signed-off-by: Javier Martinez Canillas 
---

 drivers/platform/chrome/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/platform/chrome/Kconfig b/drivers/platform/chrome/Kconfig
index cb1329919527..3271cd1abe7c 100644
--- a/drivers/platform/chrome/Kconfig
+++ b/drivers/platform/chrome/Kconfig
@@ -4,7 +4,6 @@
 
 menuconfig CHROME_PLATFORMS
bool "Platform support for Chrome hardware"
-   depends on X86 || ARM
---help---
  Say Y here to get to see options for platform support for
  various Chromebooks and Chromeboxes. This option alone does
-- 
2.1.4



[PATCH 2/3] mfd: Remove MFD_CROS_EC depends on X86 || ARM

2015-06-24 Thread Javier Martinez Canillas
A dependency on X86 || ARM for MFD_CROS_EC was added to fix the warning:

(MFD_CROS_EC) selects CHROME_PLATFORMS which has unmet direct dependencies (X86 
|| ARM)

This happened because CHROME_PLATFORMS had a dependency on X86 || ARM, but
that dependency was removed since there is no reason why the option cannot
be selected on other architectures. So the above warning will no longer
happen, and the MFD_CROS_EC dependency can be removed since it is not needed.

Signed-off-by: Javier Martinez Canillas 
---

 drivers/mfd/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index d3235e6f1953..653815950aa2 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -94,7 +94,6 @@ config MFD_AXP20X
 
 config MFD_CROS_EC
tristate "ChromeOS Embedded Controller"
-   depends on X86 || ARM
select MFD_CORE
select CHROME_PLATFORMS
select CROS_EC_PROTO
-- 
2.1.4



[RFCv2][PATCH 7/7] fsnotify: track when ignored mask clearing is needed

2015-06-24 Thread Dave Hansen

From: Dave Hansen 

According to Jan Kara:

You can have ignored mask set without any of the
notification masks set and you are expected to clear the
ignored mask on the first IN_MODIFY event.

But, the only way we currently have to go and find if we need to
do this ignored-mask-clearing is to go through the mark lists
and look for them.  That mark list iteration requires an
srcu_read_lock() which has a memory barrier and can be expensive.

The calculation of 'has_ignore' is pretty cheap because we store
it next to another value which we are updating and we do it
inside of a loop we were already running.

This patch will really only matter when we have a workload where
a file is being modified often _and_ there is an active fsnotify
mark on it.  Otherwise the checks against *_fsnotify.marks.first
will keep us out of the expensive srcu_read_lock() call.

Cc: Jan Kara 
Cc: Alexander Viro 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Paul E. McKenney 
Cc: Tim Chen 
Cc: Andi Kleen 
Signed-off-by: Dave Hansen 
---

 b/fs/notify/fsnotify.c  |   44 ++--
 b/fs/notify/mark.c  |8 +--
 b/include/linux/fsnotify_head.h |1 
 3 files changed, 45 insertions(+), 8 deletions(-)

diff -puN fs/notify/fsnotify.c~fsnotify-ignore-present fs/notify/fsnotify.c
--- a/fs/notify/fsnotify.c~fsnotify-ignore-present  2015-06-24 
17:14:37.187226743 -0700
+++ b/fs/notify/fsnotify.c  2015-06-24 17:14:37.194227057 -0700
@@ -183,6 +183,34 @@ static int send_to_group(struct inode *t
 }
 
 /*
+ * The "logical or" of all of the marks' ->mask is kept in the
+ * i/mnt_fsnotify.mask.  We can check it instead of going
+ * through all of the marks.  fsnotify_recalc_mask() does the
+ * updates.
+ */
+static int some_mark_is_interested(__u32 mask, struct inode *inode, struct 
mount *mnt)
+{
+   if (mask & inode->i_fsnotify.mask)
+   return 1;
+   if (mnt && (mask & mnt->mnt_fsnotify.mask))
+   return 1;
+   return 0;
+}
+
+/*
+ * fsnotify_recalc_mask() recalculates "has_ignore" whenever any
+ * mark's flags change.
+ */
+static int some_mark_needs_ignore_clear(struct inode *inode, struct mount *mnt)
+{
+   if (inode->i_fsnotify.has_ignore)
+   return 1;
+   if (mnt && mnt->mnt_fsnotify.has_ignore)
+   return 1;
+   return 0;
+}
+
+/*
  * This is the main call to fsnotify.  The VFS calls into hook specific 
functions
  * in linux/fsnotify.h.  Those functions then in turn call here.  Here will 
call
  * out to all of the registered fsnotify_group.  Those groups can then use the
@@ -205,14 +233,18 @@ int fsnotify(struct inode *to_tell, __u3
mnt = NULL;
 
/*
-* if this is a modify event we may need to clear the ignored masks
-* otherwise return if neither the inode nor the vfsmount care about
-* this type of event.
+* We must clear the (user-visible) ignored mask on the first IN_MODIFY
+* event despite the 'mask' which is passed in here.  But we can safely
+* skip that step if we know there are no marks which need this action.
+*
+* We can also skip looking at the list of marks if we know that none
+* of the marks are interested in the events in our 'mask'.
 */
-   if (!(mask & FS_MODIFY) &&
-   !(test_mask & to_tell->i_fsnotify.mask) &&
-   !(mnt && test_mask & mnt->mnt_fsnotify.mask))
+   if ((mask & FS_MODIFY) && !some_mark_needs_ignore_clear(to_tell, mnt))
+   return 0;
+   else if (!some_mark_is_interested(test_mask, to_tell, mnt))
return 0;
+
/*
 * Optimization: srcu_read_lock() has a memory barrier which can
 * be expensive.  It protects walking the *_fsnotify_marks lists.
diff -puN fs/notify/mark.c~fsnotify-ignore-present fs/notify/mark.c
--- a/fs/notify/mark.c~fsnotify-ignore-present  2015-06-24 17:14:37.189226832 
-0700
+++ b/fs/notify/mark.c  2015-06-24 17:14:37.194227057 -0700
@@ -116,10 +116,14 @@ void fsnotify_recalc_mask(struct fsnotif
 {
u32 new_mask = 0;
struct fsnotify_mark *mark;
+   u32 has_ignore = 0;
 
-   hlist_for_each_entry(mark, >marks, obj_list)
+   hlist_for_each_entry(mark, >marks, obj_list) {
+   if (mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
+   has_ignore = 1;
new_mask |= mark->mask;
-
+   }
+   fsn->has_ignore = has_ignore;
fsn->mask = new_mask;
 }
 
diff -puN include/linux/fsnotify_head.h~fsnotify-ignore-present 
include/linux/fsnotify_head.h
--- a/include/linux/fsnotify_head.h~fsnotify-ignore-present 2015-06-24 
17:14:37.190226877 -0700
+++ b/include/linux/fsnotify_head.h 2015-06-24 17:14:37.193227012 -0700
@@ -11,6 +11,7 @@ struct fsnotify_head {
 #ifdef CONFIG_FSNOTIFY
__u32   mask; /* all events this object 
