date:20171224

[PATCH v2] x86/microcode/intel: Blacklist the specific BDW-EP for late loading

2017-12-24 Thread Jia Zhang

Instead of blacklisting all Broadwell processors for late loading,
only BDW-EP (signature 406f1) with a microcode version lower than
0x0b21 needs to be blacklisted.

This is documented in the public documentation #334165 (see item
BDF90 for details).

Signed-off-by: Jia Zhang 
---
 arch/x86/kernel/cpu/microcode/intel.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 8ccdca6..f80b2dd 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -910,8 +910,15 @@ static bool is_blacklisted(unsigned int cpu)
 {
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
-		pr_err_once("late loading on model 79 is disabled.\n");
+	/*
+	 * The Broadwell-EP processor with a microcode version lower
+	 * than 0x0b21 may hang the system during late loading.
+	 * This behavior is documented in item BDF90, #334165
+	 * (Intel Xeon Processor E7-8800/4800 v4 Product Family).
+	 */
+	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X &&
+	    c->x86_mask == 0x01 && intel_get_microcode_revision() < 0x0b21U) {
+		pr_err_once("late loading on model 79 (sig 64f1) is disabled.\n");
 		return true;
 	}
 
-- 
1.8.3.1
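
The condition in the patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: it decodes a CPUID signature such as 406f1 into family, model and stepping, then applies the same family/model/stepping/revision test. The names decode_sig() and late_load_blocked() are hypothetical, and the revision threshold is passed in by the caller rather than hard-coded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Family, model and stepping decoded from CPUID leaf 1 EAX. */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Hypothetical helper: decode a raw signature the way Intel documents it. */
struct cpu_sig decode_sig(uint32_t eax)
{
	struct cpu_sig s;
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int base_model = (eax >> 4) & 0xf;
	unsigned int ext_model = (eax >> 16) & 0xf;
	unsigned int ext_family = (eax >> 20) & 0xff;

	s.stepping = eax & 0xf;
	s.family = base_family;
	if (base_family == 0xf)
		s.family += ext_family;
	s.model = base_model;
	/* The extended model only counts for family 6 and family 15. */
	if (base_family == 0x6 || base_family == 0xf)
		s.model |= ext_model << 4;
	return s;
}

/*
 * Hypothetical stand-in for the patched is_blacklisted(): only family 6,
 * model 79 (Broadwell-X), stepping 1 with a revision below min_safe_rev
 * is refused. min_safe_rev is supplied by the caller (an assumption made
 * for this sketch).
 */
bool late_load_blocked(uint32_t sig, uint32_t ucode_rev, uint32_t min_safe_rev)
{
	struct cpu_sig s = decode_sig(sig);

	return s.family == 6 && s.model == 79 && s.stepping == 1 &&
	       ucode_rev < min_safe_rev;
}
```

With this decoding, signature 0x406f1 yields family 6, model 79 (0x4f) and stepping 1, which is why the patch can test x86_model and x86_mask instead of comparing the raw signature.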

Re: [PATCH] x86/microcode/intel: Blacklist the specific BDW-EP for late loading

2017-12-24 Thread Jia Zhang

Sorry, I should have removed the UTF-8 characters in the comment lines.
Please ignore this patch.

Cheers,
Jia

On 2017/12/25 3:30 PM, Jia Zhang wrote:
> Instead of blacklisting all Broadwell processors for late loading,
> only BDW-EP (signature 406f1) with a microcode version lower than
> 0x0b21 needs to be blacklisted.
> 
> This is documented in the public documentation #334165 (see item
> BDF90 for details).
> 
> Signed-off-by: Jia Zhang 
> ---
>  arch/x86/kernel/cpu/microcode/intel.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 8ccdca6..5495369 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -910,8 +910,15 @@ static bool is_blacklisted(unsigned int cpu)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(cpu);
>  
> -	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
> -		pr_err_once("late loading on model 79 is disabled.\n");
> +	/*
> +	 * The Broadwell-EP processor with a microcode version lower
> +	 * than 0x0b21 may hang the system during late loading.
> +	 * This behavior is documented in item BDF90, #334165
> +	 * (Intel® Xeon® Processor E7-8800/4800 v4 Product Family).
> +	 */
> +	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X &&
> +	    c->x86_mask == 0x01 && intel_get_microcode_revision() < 0x0b21U) {
> +		pr_err_once("late loading on model 79 (sig 64f1) is disabled.\n");
>  		return true;
>  	}
>  
>
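
The non-ASCII characters mentioned in the reply (the registered-trademark signs in the quoted comment) can be caught mechanically before sending a patch. The following is a small sketch, assuming the patch text is already in a memory buffer; first_non_ascii() is a hypothetical helper, not an existing kernel or git tool.

```c
#include <stddef.h>

/*
 * Return the offset of the first byte outside 7-bit ASCII in buf, or -1
 * if every byte is plain ASCII. Every byte of a UTF-8 multi-byte
 * sequence has its high bit set, so a single comparison is enough.
 */
long first_non_ascii(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if ((unsigned char)buf[i] > 0x7f)
			return (long)i;
	}
	return -1;
}
```

Run over the comment line from the quoted patch, such a check would stop at the first byte of the trademark sign after "Intel".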

[BUG ? ] Each pci bridge only supports hotplugging 16 numbers of virtio-blk/virtio-net devices

2017-12-24 Thread Hailiang Zhang


Hi,

We tried to hot-add more than 16 virtio-blk devices to a PCI bridge,
but found that only 16 of them are available in the VM.

There are 'no space' error messages in dmesg:

[4.666106] pci :00:03.0: PCI bridge to [bus 01]
[4.666191] pci :00:03.0:   bridge window [io  0x7000-0x7fff]
[4.670044] pci :00:03.0:   bridge window [mem 0xfe80-0xfe9f]
[4.672650] pci :00:03.0:   bridge window [mem 0xfcc0-0xfcdf 64bit pref]
[4.677876] pci :00:07.0: PCI bridge to [bus 02]
[4.677967] pci :00:07.0:   bridge window [io  0x6000-0x6fff]
[4.681816] pci :00:07.0:   bridge window [mem 0xfe60-0xfe7f]
[4.684422] pci :00:07.0:   bridge window [mem 0xfca0-0xfcbf 64bit pref]
...
[   85.779103] pci :02:17.0: [1af4:1001] type 00 class 0x01
[   85.779194] pci :02:17.0: reg 0x10: [io  0x-0x003f]
[   85.779235] pci :02:17.0: reg 0x14: [mem 0x-0x0fff]
[   85.779812] pci :02:17.0: BAR 1: assigned [mem 0xfe60f000-0xfe60]
[   85.779835] pci :02:17.0: BAR 0: assigned [io  0x6cc0-0x6cff]
[   85.779951] virtio-pci :02:17.0: enabling device ( -> 0003)
[   85.833435] virtio-pci :02:17.0: virtio_pci: leaving for legacy driver
[   85.846894] virtio-pci :02:17.0: irq 61 for MSI/MSI-X
[   85.846927] virtio-pci :02:17.0: irq 62 for MSI/MSI-X
[   86.013107] pci :02:18.0: [1af4:1001] type 00 class 0x01
[   86.013199] pci :02:18.0: reg 0x10: [io  0x-0x003f]
[   86.013241] pci :02:18.0: reg 0x14: [mem 0x-0x0fff]
[   86.013868] pci :02:18.0: BAR 1: assigned [mem 0xfe61-0xfe610fff]
[   86.013903] pci :02:18.0: BAR 0: no space for [io  size 0x0040]
[   86.013925] pci :02:18.0: BAR 0: failed to assign [io  size 0x0040]
[   86.014010] virtio-pci :02:18.0: enabling device ( -> 0002)
[   86.057575] virtio-pci :02:18.0: virtio_pci: leaving for legacy driver
[   86.088217] virtio-pci: probe of :02:18.0 failed with error -12

We went through the kernel code that handles PCI device hotplug; the
call stack is:

acpi_hotplug_work_fn
->enable_slot
->__pci_bus_assign_resources
->pci_bus_alloc_resource
->pci_bus_alloc_from_region
->allocate_resource
->find_resource
->pcibios_align_resource

The failure comes from pcibios_align_resource().

resource_size_t
pcibios_align_resource(void *data, const struct resource *res,
			resource_size_t size, resource_size_t align)
{
	struct pci_dev *dev = data;
	resource_size_t start = res->start;

	if (res->flags & IORESOURCE_IO) {
		if (skip_isa_ioresource_align(dev))
			return start;
		if (start & 0x300)
			start = (start + 0x3ff) & ~0x3ff;	/* <-- here */

With the above logic, only the following I/O addresses are available for
virtio-blk:

[0x6000-0x603f], [0x6040-0x607f], [0x6080-0x60bf], [0x60c0-0x60ff]
[0x6400-0x643f], [0x6440-0x647f], [0x6480-0x64bf], [0x64c0-0x64ff]
[0x6800-0x683f], [0x6840-0x687f], [0x6880-0x68bf], [0x68c0-0x68ff]
[0x6c00-0x6c3f], [0x6c40-0x6c7f], [0x6c80-0x6cbf], [0x6cc0-0x6cff]

So the number is just 16.
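
The count of 16 can be reproduced with a small userspace model. This is only a sketch under simplifying assumptions: it walks the bridge window with a plain first-fit cursor instead of the kernel's resource allocator, and isa_align_io() and count_io_bars() are hypothetical names. The rounding rule itself is copied from the I/O branch quoted above.

```c
#include <stdint.h>

/* The I/O branch of the alignment rule: keep offsets in 0x000-0x0ff mod 0x400. */
uint32_t isa_align_io(uint32_t start)
{
	if (start & 0x300)
		start = (start + 0x3ff) & ~0x3ffu;
	return start;
}

/*
 * Count how many size-byte I/O BARs fit into [win_start, win_end] using a
 * first-fit walk. size must be a power of two. isa_align selects whether
 * the ISA rounding above is applied to each candidate address.
 */
unsigned int count_io_bars(uint32_t win_start, uint32_t win_end,
			   uint32_t size, int isa_align)
{
	uint32_t cursor = win_start;
	unsigned int n = 0;

	for (;;) {
		/* Natural alignment of the BAR to its own size. */
		uint32_t start = (cursor + size - 1) & ~(size - 1);

		if (isa_align)
			start = isa_align_io(start);
		/* Stop once the BAR no longer fits inside the window. */
		if (start < cursor || start > win_end ||
		    win_end - start < size - 1)
			break;
		n++;
		cursor = start + size;
	}
	return n;
}
```

For the window 0x6000-0x6fff and a BAR size of 0x40, the model places 16 BARs with the ISA rounding and 64 without it, so three quarters of the bridge's I/O window are skipped by the rule.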

We have noticed the comment above pcibios_align_resource():

+/*
+ * We need to avoid collisions with `mirrored' VGA ports
+ * and other strange ISA hardware, so we always want the
+ * addresses to be allocated in the 0x000-0x0ff region
+ * modulo 0x400.
+ *

But we still do not quite understand this. Does anyone know the reason
behind it?

Or could we skip this check for standard PCI devices?

Thanks,

Hailiang

[PATCH] x86/microcode/intel: Blacklist the specific BDW-EP for late loading

2017-12-24 Thread Jia Zhang

Instead of blacklisting all Broadwell processors for late loading,
only BDW-EP (signature 406f1) with a microcode version lower than
0x0b21 needs to be blacklisted.

This is documented in the public documentation #334165 (see item
BDF90 for details).

Signed-off-by: Jia Zhang 
---
 arch/x86/kernel/cpu/microcode/intel.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 8ccdca6..5495369 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -910,8 +910,15 @@ static bool is_blacklisted(unsigned int cpu)
 {
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
-		pr_err_once("late loading on model 79 is disabled.\n");
+	/*
+	 * The Broadwell-EP processor with a microcode version lower
+	 * than 0x0b21 may hang the system during late loading.
+	 * This behavior is documented in item BDF90, #334165
+	 * (Intel® Xeon® Processor E7-8800/4800 v4 Product Family).
+	 */
+	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X &&
+	    c->x86_mask == 0x01 && intel_get_microcode_revision() < 0x0b21U) {
+		pr_err_once("late loading on model 79 (sig 64f1) is disabled.\n");
 		return true;
 	}
 
-- 
1.8.3.1

Re: [PATCH] KVM: x86: ioapic: Clear IRR for rtc bit when rtc EOI gotten

2017-12-24 Thread Wanpeng Li

2017-12-14 20:23 GMT+08:00 Gonglei :
> We hit a bug in our test while running PCMark 10 in a Windows 7 VM:
> the VM got stuck and the wall clock hung after several minutes of
> running PCMark 10 in it.
> It is quite easy to reproduce the bug with upstream KVM and QEMU.
>
> We found that KVM cannot inject any RTC irq into the VM after it hangs;
> it fails to deliver the irq in ioapic_set_irq() because the RTC irq is
> still pending in ioapic->irr.
>
> static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
>   int irq_level, bool line_status)
> {
> ...
>  if (!irq_level) {
>   ioapic->irr &= ~mask;
>   ret = 1;
>   goto out;
>  }
> ...
>  if ((edge && old_irr == ioapic->irr) ||
>  (!edge && entry.fields.remote_irr)) {
>   ret = 0;
>   goto out;
>  }
>
> According to the RTC spec, after the RTC injects a high-level irq, the OS will read CMOS's

I think it is falling edge and active low.

Regards,
Wanpeng Li

> register C to clear the irq flag, and pull down the irq pin.
>
> In QEMU, the read is emulated in cmos_ioport_read(), but the guest OS
> first issues a write to select which register will be read next;
> s->cmos_index records that register.
>
> In our test, however, we found a situation in which a vCPU fails to read
> RTC_REG_C to clear the irq. This can happen when two vCPUs are
> writing/reading registers at the same time. For example, vcpu0 wants to
> read RTC_REG_C, so it writes RTC_REG_C first and s->cmos_index becomes
> RTC_REG_C. Before it reads register C, vcpu1 wants to read RTC_YEAR and
> changes s->cmos_index to RTC_YEAR with a write. The next read by vcpu0
> then returns RTC_YEAR, so we never call qemu_irq_lower(s->irq) to clear
> the irq. After this, KVM never injects the RTC irq again and the
> Windows VM hangs.
>
> Let's clear the RTC bit in IRR when the corresponding EOI arrives to
> avoid the issue.
>
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Gonglei 
> ---
>   Thanks to Paolo for providing a good solution. :)
>
>  arch/x86/kvm/ioapic.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 4e822ad..5022d63 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -160,6 +160,7 @@ static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu)
>  {
>  	if (test_and_clear_bit(vcpu->vcpu_id,
>  			       ioapic->rtc_status.dest_map.map)) {
> +		ioapic->irr &= ~(1 << RTC_GSI);
>  		--ioapic->rtc_status.pending_eoi;
>  		rtc_status_pending_eoi_check_valid(ioapic);
>  	}
> --
> 1.8.3.1
>
>
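
The interleaving described in the quoted report, and the effect of the one-line fix, can be replayed with a toy model. This is a sketch only, not KVM or QEMU code: every toy_* name is hypothetical, the model tracks just the RTC bit of the IRR and the CMOS index, and the RTC is assumed to sit on GSI 8 with register C at index 0x0c and the year register at index 0x09.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_RTC_GSI 8	/* assumption: RTC interrupt on GSI 8 */

enum { TOY_REG_YEAR = 0x09, TOY_REG_C = 0x0c };

struct toy_vm {
	uint32_t irr;		/* IOAPIC interrupt request register */
	uint8_t cmos_index;	/* register selected by the last index write */
	bool clear_irr_on_eoi;	/* models the one-line change in the patch */
};

/* Edge-triggered delivery: returns true if an interrupt is injected. */
bool toy_set_irq(struct toy_vm *vm, int level)
{
	uint32_t mask = 1u << TOY_RTC_GSI;
	uint32_t old_irr = vm->irr;

	if (!level) {
		vm->irr &= ~mask;
		return false;
	}
	vm->irr |= mask;
	return old_irr != vm->irr;	/* unchanged IRR: the edge is dropped */
}

void toy_cmos_select(struct toy_vm *vm, uint8_t reg)
{
	vm->cmos_index = reg;
}

/* Only a read of register C lowers the interrupt line. */
void toy_cmos_read(struct toy_vm *vm)
{
	if (vm->cmos_index == TOY_REG_C)
		toy_set_irq(vm, 0);
}

void toy_eoi(struct toy_vm *vm)
{
	if (vm->clear_irr_on_eoi)
		vm->irr &= ~(1u << TOY_RTC_GSI);
}

/* Replay the race; report whether the second RTC interrupt gets injected. */
bool toy_second_irq_delivered(bool with_fix)
{
	struct toy_vm vm = { .clear_irr_on_eoi = with_fix };

	toy_set_irq(&vm, 1);			/* first RTC interrupt */
	toy_cmos_select(&vm, TOY_REG_C);	/* vcpu0 selects register C */
	toy_cmos_select(&vm, TOY_REG_YEAR);	/* vcpu1 overwrites the index */
	toy_cmos_read(&vm);			/* vcpu0 reads RTC_YEAR instead */
	toy_eoi(&vm);				/* guest acknowledges the interrupt */
	return toy_set_irq(&vm, 1);		/* next RTC interrupt */
}
```

Without the fix the RTC bit stays set in the IRR after the misdirected read, so the next edge compares equal and is dropped; with the bit cleared at EOI the next interrupt is injected again.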

Re: Php-fpm will crash when perf runs with call graph option

2017-12-24 Thread Wangnan (F)


Have you tried updating your kernel to a recent version?


On 2017/12/25 14:58, ufo19890607 wrote:

From: yuzhoujian 

I use perf to analyze the performance overhead on a server that runs
several Docker containers. The php-fpm process in a container crashes
whenever perf collects samples for all CPUs with the call graph option
(perf record -ag). Below is the stack trace from the core dump.

#0  0x7f044ff447bd in re_compile_fastmap_iter (bufp=0x7f044ff447bd ,
    fastmap=0x46 , init_state=, init_state=) at regcomp.c:407
407   if (__wcrtomb (buf, towlower (cset->mbchars[i]), )
(gdb) bt
#0  0x7f044ff447bd in re_compile_fastmap_iter (bufp=0x7f044ff447bd ,
    fastmap=0x46 , init_state=, init_state=) at regcomp.c:407
#1  0x00831160 in virtual_file_ex (state=0x7fff9c1a4f70, path=, verify_path=0x0, use_realpath=1)
    at /home/xiaoju/phpng/php-7.0.6/Zend/zend_virtual_cwd.c:1335
#2  0x007aacee in expand_filepath_with_mode (
    filepath=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app",
    real_path=0x7fff9c1a4fc0 "\360X\032\234\377\177", relative_to=, relative_to_len=0, realpath_mode=1)
    at /home/xiaoju/phpng/php-7.0.6/main/fopen_wrappers.c:812
#3  0x007c1536 in _php_stream_fopen (
    filename=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app", mode=0xdbb1f1 "rb",
    opened_path=0x0, options=0) at /home/xiaoju/phpng/php-7.0.6/main/streams/plain_wrapper.c:970
#4  0x007bd084 in _php_stream_open_wrapper_ex (
    path=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app", mode=0xdbb1f1 "rb", options=8,
    opened_path=0x0, context=0x7f044d65f4c0) at /home/xiaoju/phpng/php-7.0.6/main/streams/streams.c:2060
#5  0x0071722b in zif_file_get_contents (execute_data=, return_value=0x7f044d615540)
    at /home/xiaoju/phpng/php-7.0.6/ext/standard/file.c:544
#6  0x0065387c in phar_file_get_contents (execute_data=0x7f044d615570, return_value=0x7f044d615540)
    at /home/xiaoju/phpng/php-7.0.6/ext/phar/func_interceptors.c:224))

I added some debug output to the PHP source code and found that the rbp
value in virtual_file_ex is really strange, e.g. 0x1 or 0x31. My guess
is that when perf collects samples for all CPUs with the -g option, it
may corrupt php-fpm's stack. When perf runs without -g, php-fpm behaves
normally. Has anyone encountered a similar problem?

BTW, OS on the server: CentOS 7.3, kernel version: 3.10.0-514.16.1.el7.x86_64,
php-fpm version: 7.0.6
Processor info: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
--
To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 00/11] FUSE mounts from non-init user namespaces

2017-12-24 Thread Eric W. Biederman

Dongsu Park  writes:

> This patchset v5 is based on work by Seth Forshee and Eric Biederman.
> The latest patchset was v4:
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1132206.html
>
> At the moment, filesystems backed by physical medium can only be mounted
> by real root in the initial user namespace. This restriction exists
> because if it's allowed for root user in non-init user namespaces to
> mount the filesystem, then it effectively allows the user to control the
> underlying source of the filesystem. In case of FUSE, the source would
> mean any underlying device.
>
> However, in many use cases such as containers, it's necessary to allow
> filesystems to be mounted from non-init user namespaces. The goal of this
> patchset is to allow FUSE filesystems to be mounted from non-init user
> namespaces. Support for other filesystems like ext4 is not in the
> scope of this patchset.
>
> Let me describe how to test mounting from non-init user namespaces. It's
> assumed that tests are done via sshfs, a userspace filesystem based on
> FUSE with ssh as backend. Testing system is Fedora 27.

In general I am for this work, and more bodies and more eyes on it is
generally better.

I will review this after the New Year, I am out for the holidays right
now.

Eric


>
> 
> $ sudo dnf install -y sshfs
> $ sudo mkdir -p /mnt/userns
>
> ### workaround to get the sshfs permission checks
> $ sudo chown -R $UID:$UID /etc/ssh/ssh_config.d /usr/share/crypto-policies
>
> $ unshare -U -r -m
> # sshfs root@localhost: /mnt/userns
>
> ### You can see sshfs being mounted from a non-init user namespace
> # mount | grep sshfs
> root@localhost: on /mnt/userns type fuse.sshfs
> (rw,nosuid,nodev,relatime,user_id=0,group_id=0)
>
> # touch /mnt/userns/test
> # ls -l /mnt/userns/test
> -rw-r--r-- 1 root root 0 Dec 11 19:01 /mnt/userns/test
> 
>
> Open another terminal, check the mountpoint from outside the namespace.
>
> 
> $ grep userns /proc/$(pidof sshfs)/mountinfo
> 131 102 0:35 / /mnt/userns rw,nosuid,nodev,relatime - fuse.sshfs
> root@localhost: rw,user_id=0,group_id=0
> 
>
> After all tests are done, you can unmount the filesystem
> inside the namespace.
>
> 
> # fusermount -u /mnt/userns
> 
>
> Changes since v4:
>  * Remove other parts like ext4 to keep the patchset minimal for FUSE
>  * Add and change commit messages
>  * Describe how to test non-init user namespaces
>
> TODO:
>  * Think through potential security implications. There are 2 patches
>being prepared for security issues. One is "ima: define a new policy
>option named force" by Mimi Zohar, which adds an option to specify
>that the results should not be cached:
>https://marc.info/?l=linux-integrity&m=151275680115856&w=2
>The other one is to basically prevent FUSE results from being cached,
>which is still in progress.
>
>  * Test IMA/LSMs. Details are written in
>
> https://github.com/kinvolk/fuse-userns-patches/blob/master/tests/TESTING_INTEGRITY.md
>
> Patches 1-2 deal with an additional flag of lookup_bdev() to check for
> additional inode permission.
>
> Patches 3-7 allow the superblock owner to change ownership of inodes, and
> deal with additional capability checks w.r.t user namespaces.
>
> Patches 8-10 allow FUSE filesystems to be mounted outside of the init
> user namespace.
>
> Patch 11 handles a corner case of non-root users in EVM.
>
> The patchset is also available in our github repo:
>   https://github.com/kinvolk/linux/tree/dongsu/fuse-userns-v5-1
>
>
> Eric W. Biederman (1):
>   fs: Allow superblock owner to change ownership of inodes
>
> Seth Forshee (10):
>   block_dev: Support checking inode permissions in lookup_bdev()
>   mtd: Check permissions towards mtd block device inode when mounting
>   fs: Don't remove suid for CAP_FSETID for userns root
>   fs: Allow superblock owner to access do_remount_sb()
>   capabilities: Allow privileged user in s_user_ns to set security.*
> xattrs
>   fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
>   fuse: Support fuse filesystems outside of init_user_ns
>   fuse: Restrict allow_other to the superblock's namespace or a
> descendant
>   fuse: Allow user namespace mounts
>   evm: Don't update hmacs in user ns mounts
>
>  drivers/md/bcache/super.c   |  2 +-
>  drivers/md/dm-table.c   |  2 +-
>  drivers/mtd/mtdsuper.c  |  6 +-
>  fs/attr.c   | 34 ++
>  fs/block_dev.c  | 13 ++---
>  fs/fuse/cuse.c  |  3 ++-
>  fs/fuse/dev.c   | 11 ---
>  fs/fuse/dir.c   | 16 
>  fs/fuse/fuse_i.h|  6 +-
>  fs/fuse/inode.c | 35 +--
>  fs/inode.c  |  6 --
>  fs/ioctl.c  |  4 ++--
>  fs/namespace.c

Php-fpm will crash when perf runs with call graph option

2017-12-24 Thread ufo19890607

From: yuzhoujian 

I am using perf to analyze performance overhead on a server that runs
several Docker containers. The php-fpm process inside a container
crashes whenever perf collects samples on all CPUs with the call-graph
option (perf record -ag). Below is the stack trace from the core dump.

#0  0x7f044ff447bd in re_compile_fastmap_iter (bufp=0x7f044ff447bd,
    fastmap=0x46, init_state=, init_state=) at regcomp.c:407
407       if (__wcrtomb (buf, towlower (cset->mbchars[i]), )
(gdb) bt
#0  0x7f044ff447bd in re_compile_fastmap_iter (bufp=0x7f044ff447bd,
    fastmap=0x46, init_state=, init_state=) at regcomp.c:407
#1  0x00831160 in virtual_file_ex (state=0x7fff9c1a4f70, path=,
    verify_path=0x0, use_realpath=1)
    at /home/xiaoju/phpng/php-7.0.6/Zend/zend_virtual_cwd.c:1335
#2  0x007aacee in expand_filepath_with_mode (
    filepath=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app",
    real_path=0x7fff9c1a4fc0 "\360X\032\234\377\177", relative_to=,
    relative_to_len=0, realpath_mode=1)
    at /home/xiaoju/phpng/php-7.0.6/main/fopen_wrappers.c:812
#3  0x007c1536 in _php_stream_fopen (
    filename=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app",
    mode=0xdbb1f1 "rb", opened_path=0x0, options=0)
    at /home/xiaoju/phpng/php-7.0.6/main/streams/plain_wrapper.c:970
#4  0x007bd084 in _php_stream_open_wrapper_ex (
    path=0x7f044d6020d8 "/home/xiaoju/ep/as/store//toggles/beatles_api_discovery_is_open_by_app",
    mode=0xdbb1f1 "rb", options=8, opened_path=0x0, context=0x7f044d65f4c0)
    at /home/xiaoju/phpng/php-7.0.6/main/streams/streams.c:2060
#5  0x0071722b in zif_file_get_contents (execute_data=,
    return_value=0x7f044d615540)
    at /home/xiaoju/phpng/php-7.0.6/ext/standard/file.c:544
#6  0x0065387c in phar_file_get_contents (execute_data=0x7f044d615570,
    return_value=0x7f044d615540)
    at /home/xiaoju/phpng/php-7.0.6/ext/phar/func_interceptors.c:224))

I added some debug output to the PHP source code and found that the rbp
value in virtual_file_ex is very strange, e.g. 0x1 or 0x31. My guess is
that when perf collects samples on all CPUs with the -g option, it may
corrupt php-fpm's stack. When perf runs without -g, php-fpm works
normally. Has anyone encountered a similar problem?

BTW, OS on the server: CentOS 7.3, kernel version: 3.10.0-514.16.1.el7.x86_64.
php-fpm version: 7.0.6
Processor info: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

RE: [PATCH] vfio: mdev: make a couple of functions and structure vfio_mdev_driver static

2017-12-24 Thread Liu, Yi L

> Sent: Friday, December 22, 2017 7:12 AM
> To: kwankh...@nvidia.com; alex.william...@redhat.com
> Cc: k...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH] vfio: mdev: make a couple of functions and structure
> vfio_mdev_driver static
> 
> The functions vfio_mdev_probe, vfio_mdev_remove and the structure
> vfio_mdev_driver are only used in this file, so make them static.
> 
> Clean up sparse warnings:
> drivers/vfio/mdev/vfio_mdev.c:114:5: warning: no previous prototype for
> 'vfio_mdev_probe' [-Wmissing-prototypes]
> drivers/vfio/mdev/vfio_mdev.c:121:6: warning: no previous prototype for
> 'vfio_mdev_remove' [-Wmissing-prototypes]
> 
> Signed-off-by: Xiongwei Song 
> ---
>  drivers/vfio/mdev/vfio_mdev.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index fa848a701b8b..d230620fe02d 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -111,19 +111,19 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
>   .mmap   = vfio_mdev_mmap,
>  };
> 
> -int vfio_mdev_probe(struct device *dev)
> +static int vfio_mdev_probe(struct device *dev)
>  {
>   struct mdev_device *mdev = to_mdev_device(dev);
> 
>   return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);
>  }
> 
> -void vfio_mdev_remove(struct device *dev)
> +static void vfio_mdev_remove(struct device *dev)
>  {
>   vfio_del_group_dev(dev);
>  }
> 
> -struct mdev_driver vfio_mdev_driver = {
> +static struct mdev_driver vfio_mdev_driver = {
>   .name   = "vfio_mdev",
>   .probe  = vfio_mdev_probe,
>   .remove = vfio_mdev_remove,
> --
> 2.15.1

Reviewed-by: Liu, Yi L

[PATCH v7 1/2] regmap: Add one flag to indicate if a hwlock should be used

2017-12-24 Thread Baolin Wang

The hardware spinlock core treats hwlock id 0 as valid, but regmap
currently treats id 0 as invalid. Add a flag to the regmap config that
indicates whether a hardware spinlock should be used, so that regmap
can also request id 0.

Signed-off-by: Baolin Wang 
---
 - Add this new patch in V7.
---
 drivers/base/regmap/regmap.c |2 +-
 include/linux/regmap.h   |2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index f25ab18..d23a5c9 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -671,7 +671,7 @@ struct regmap *__regmap_init(struct device *dev,
map->lock = config->lock;
map->unlock = config->unlock;
map->lock_arg = config->lock_arg;
-   } else if (config->hwlock_id) {
+   } else if (config->use_hwlock) {
map->hwlock = hwspin_lock_request_specific(config->hwlock_id);
if (!map->hwlock) {
ret = -ENXIO;
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 15eddc1..c78e005 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -317,6 +317,7 @@ struct regmap_access_table {
  *
  * @ranges: Array of configuration entries for virtual address ranges.
  * @num_ranges: Number of range configuration entries.
+ * @use_hwlock: Indicate if a hardware spinlock should be used.
  * @hwlock_id: Specify the hardware spinlock id.
  * @hwlock_mode: The hardware spinlock mode, should be HWLOCK_IRQSTATE,
  *  HWLOCK_IRQ or 0.
@@ -365,6 +366,7 @@ struct regmap_config {
const struct regmap_range_cfg *ranges;
unsigned int num_ranges;
 
+   bool use_hwlock;
unsigned int hwlock_id;
unsigned int hwlock_mode;
 };
-- 
1.7.9.5
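
The reason a separate flag is needed can be illustrated outside the
kernel. The following is a minimal user-space sketch, not the regmap
implementation: struct demo_config, wants_hwlock_old() and
wants_hwlock_new() are hypothetical names that only mirror the old and
new conditions in the __regmap_init() hunk above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hwlock fields of struct regmap_config. */
struct demo_config {
	bool use_hwlock;        /* explicit flag, as added by this patch */
	unsigned int hwlock_id; /* 0 is a valid hardware spinlock id */
};

/* Old condition: the id doubles as the "enabled" switch, so id 0 is lost. */
static bool wants_hwlock_old(const struct demo_config *cfg)
{
	return cfg->hwlock_id != 0;
}

/* New condition: the flag decides, so any id (including 0) can be requested. */
static bool wants_hwlock_new(const struct demo_config *cfg)
{
	return cfg->use_hwlock;
}
```

With use_hwlock set and hwlock_id equal to 0, the old test declines the
lock while the new test requests it, which is the case the patch
description is about.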

[PATCH v7 2/2] mfd: syscon: Add hardware spinlock support

2017-12-24 Thread Baolin Wang

Some system control registers need a hardware spinlock to synchronize
accesses between multiple subsystems, so add hardware spinlock support
to syscon.

Signed-off-by: Baolin Wang 
Acked-by: Rob Herring 
---
Changes since v6:
 - Treat hwlock id 0 as valid for regmap.

Changes since v5:
 - Fix the case that hwspinlock is not enabled.

Changes since v4:
 - Add one example to show how to add hwlock.
 - Fix the coding style issue.

Changes since v3:
 - Add error handling for of_hwspin_lock_get_id()

Changes since v2:
 - Add acked tag from Rob.

Changes since v1:
 - Remove timeout configuration.
 - Modify the binding file to add hwlocks.
---
 Documentation/devicetree/bindings/mfd/syscon.txt |8 
 drivers/mfd/syscon.c |   19 +++
 2 files changed, 27 insertions(+)

diff --git a/Documentation/devicetree/bindings/mfd/syscon.txt b/Documentation/devicetree/bindings/mfd/syscon.txt
index 8b92d45..25d9e9c 100644
--- a/Documentation/devicetree/bindings/mfd/syscon.txt
+++ b/Documentation/devicetree/bindings/mfd/syscon.txt
@@ -16,9 +16,17 @@ Required properties:
 Optional property:
 - reg-io-width: the size (in bytes) of the IO accesses that should be
   performed on the device.
+- hwlocks: reference to a phandle of a hardware spinlock provider node.
 
 Examples:
 gpr: iomuxc-gpr@20e0000 {
 	compatible = "fsl,imx6q-iomuxc-gpr", "syscon";
 	reg = <0x020e0000 0x38>;
+	hwlocks = <&hwlock1 1>;
+};
+
+hwlock1: hwspinlock@40500000 {
+	...
+	reg = <0x40500000 0x1000>;
+	#hwlock-cells = <1>;
 };
diff --git a/drivers/mfd/syscon.c b/drivers/mfd/syscon.c
index b93fe4c..7eaa40b 100644
--- a/drivers/mfd/syscon.c
+++ b/drivers/mfd/syscon.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/err.h>
+#include <linux/hwspinlock.h>
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/list.h>
@@ -87,6 +88,24 @@ static struct syscon *of_syscon_register(struct device_node *np)
if (ret)
reg_io_width = 4;
 
+   ret = of_hwspin_lock_get_id(np, 0);
+   if (ret > 0 || (IS_ENABLED(CONFIG_HWSPINLOCK) && ret == 0)) {
+   syscon_config.use_hwlock = true;
+   syscon_config.hwlock_id = ret;
+   syscon_config.hwlock_mode = HWLOCK_IRQSTATE;
+   } else if (ret < 0) {
+   switch (ret) {
+   case -ENOENT:
+   /* Ignore missing hwlock, it's optional. */
+   break;
+   default:
+   pr_err("Failed to retrieve valid hwlock: %d\n", ret);
+   /* fall-through */
+   case -EPROBE_DEFER:
+   goto err_regmap;
+   }
+   }
+
syscon_config.reg_stride = reg_io_width;
syscon_config.val_bits = reg_io_width * 8;
 	syscon_config.max_register = resource_size(&res) - reg_io_width;
-- 
1.7.9.5
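
The branch structure of the of_syscon_register() hunk above can be
summarized as a small decision function. This is a user-space sketch
under stated assumptions, not kernel code: enum hwlock_action and
classify_hwlock_ret() are hypothetical names, the hwspinlock_enabled
parameter models IS_ENABLED(CONFIG_HWSPINLOCK), and EPROBE_DEFER is
defined locally with the kernel's value because user-space errno.h does
not provide it.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal error code; defined here only for this sketch. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

enum hwlock_action {
	HWLOCK_USE,  /* configure regmap with the returned lock id */
	HWLOCK_SKIP, /* no hwlock available or described: continue without one */
	HWLOCK_FAIL, /* abort registration */
};

/* Hypothetical helper mirroring the if/else-if/switch in the patch. */
static enum hwlock_action classify_hwlock_ret(int ret, bool hwspinlock_enabled)
{
	/* A positive id is always usable; id 0 only counts if hwspinlock is built in. */
	if (ret > 0 || (hwspinlock_enabled && ret == 0))
		return HWLOCK_USE;
	/* The hwlocks property is optional, so a missing entry is not an error. */
	if (ret == -ENOENT)
		return HWLOCK_SKIP;
	/* Every other negative value fails; the patch logs all but -EPROBE_DEFER. */
	if (ret < 0)
		return HWLOCK_FAIL;
	/* ret == 0 without hwspinlock support: neither branch of the patch runs. */
	return HWLOCK_SKIP;
}
```

The last case is the one the v5 changelog entry refers to: a return
value of 0 is only interpreted as lock id 0 when hwspinlock support is
enabled.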

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-24 Thread Yunlong Song

What if the application starts an atomic write but never commits it,
e.g. because of a bug in the application, or because the application
itself is malicious?

On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

In some corner cases, f2fs_gc selects a victim segment but cannot free
it for some reason (e.g. the segment holds blocks of an atomic file
that has not been committed yet). In this case, the victim

A file should not stay atomically opened for a long time, since an
SQLite transaction normally finishes quickly, so we can expect the GC
loop to end soon, right?

Thanks,


segment may be selected over and over, and f2fs_gc then falls into a
dead loop. This patch identifies the dead-loop segment and skips it in
__get_victim the next time.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/f2fs.h  |  8 
  fs/f2fs/gc.c| 34 ++
  fs/f2fs/super.c |  3 +++
  3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
  };
  
+struct gc_loop_info {
+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
  #define F2FS_FEATURE_ENCRYPT  0x0001
  #define F2FS_FEATURE_BLKZONED 0x0002
  #define F2FS_FEATURE_ATOMIC_WRITE 0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
  
  	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
  
  	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info *sbi)
if (no_fggc_candidate(sbi, secno))
continue;
  
+   if (sbi->gc_loop.segmap &&
+   test_bit(GET_SEG_FROM_SEC(sbi, secno), sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, sbi->gc_loop.segmap))
+   goto next;
  
  		cost = get_gc_cost(sbi, segno, &p);
  
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
  
  	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
  
  	if (sync)

ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
  }
  
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
  
  	for (i = 0; i < NR_INODE_TYPE; i++) {
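
The bookkeeping that the f2fs_gc() hunk adds can be followed more
easily in isolation. The sketch below is a user-space model under
stated assumptions, not the f2fs code: struct demo_gc_loop and the
demo_* functions are hypothetical names, a bool array stands in for the
kvzalloc'ed bitmap, DEMO_SEGS is an arbitrary segment count, and the
f2fs_bug_on() call from the patch is left out.

```c
#include <limits.h>
#include <stdbool.h>

#define GC_LOOP_MAX 10
#define NULL_SEGNO  UINT_MAX /* sentinel meaning "no segment tracked" */
#define DEMO_SEGS   64       /* hypothetical number of segments */

struct demo_gc_loop {
	int count;            /* repeats seen for 'segno' since it was first tracked */
	unsigned int segno;   /* segment that most recently failed to free */
	bool skip[DEMO_SEGS]; /* stands in for the kernel bitmap 'segmap' */
};

static void demo_gc_loop_init(struct demo_gc_loop *l)
{
	l->count = 0;
	l->segno = NULL_SEGNO;
	for (unsigned int i = 0; i < DEMO_SEGS; i++)
		l->skip[i] = false;
}

/*
 * Record that foreground GC freed nothing from 'segno'.
 * Returns true when the segment has just been marked to be skipped.
 */
static bool demo_gc_loop_note_failure(struct demo_gc_loop *l, unsigned int segno)
{
	if (segno != l->segno) {
		/* A different victim: start tracking it from scratch. */
		l->segno = segno;
		l->count = 0;
		return false;
	}
	if (l->count > GC_LOOP_MAX) {
		/* Same victim failed too often: exclude it from victim selection. */
		l->skip[segno] = true;
		l->count = 0;
		l->segno = NULL_SEGNO;
		return true;
	}
	l->count++;
	return false;
}
```

With GC_LOOP_MAX set to 10, this logic marks a segment on its
thirteenth consecutive failure: the first failure only starts tracking,
and the counter has to exceed 10 before the skip flag is set.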

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-24 Thread Yunlong Song

What if the application starts an atomic write but never commits it,
e.g. because of a bug in the application, or because the application
itself is malicious?

On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

In some corner cases, f2fs_gc selects a victim segment but cannot free
it for some reason (e.g. the segment holds blocks of an atomic file
that has not been committed yet). In this case, the victim

A file should not stay atomically opened for a long time, since an
SQLite transaction normally finishes quickly, so we can expect the GC
loop to end soon, right?

Thanks,


segment may be selected over and over, and f2fs_gc then falls into a
dead loop. This patch identifies the dead-loop segment and skips it in
__get_victim the next time.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/f2fs.h  |  8 
  fs/f2fs/gc.c| 34 ++
  fs/f2fs/super.c |  3 +++
  3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
  };
  
+struct gc_loop_info {
+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
  #define F2FS_FEATURE_ENCRYPT  0x0001
  #define F2FS_FEATURE_BLKZONED 0x0002
  #define F2FS_FEATURE_ATOMIC_WRITE 0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
  
  	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
  
  	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
  
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
  
  		cost = get_gc_cost(sbi, segno, &p);
  
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
  
  	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
  
  	if (sync)

ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
  }
  
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
  
  	for (i = 0; i < NR_INODE_TYPE; i++) {

INIT_LIST_HEAD(&sbi->inode_list[i]);



.



--

Re: [PATCH] f2fs: return error during fill_super

2017-12-24 Thread Chao Yu

On 2017/12/21 4:39, Jaegeuk Kim wrote:
> Let's avoid BUG_ON during fill_super, when on-disk was totally corrupted.
> 
> Signed-off-by: Jaegeuk Kim 

Reviewed-by: Chao Yu 

Thanks,

Re: [PATCH] vfio: mdev: make a couple of functions and structure vfio_mdev_driver static

2017-12-24 Thread Quan Xu




On 2017/12/22 07:12, Xiongwei Song wrote:

The functions vfio_mdev_probe, vfio_mdev_remove and the structure
vfio_mdev_driver are only used in this file, so make them static.

Clean up sparse warnings:
drivers/vfio/mdev/vfio_mdev.c:114:5: warning: no previous prototype
for 'vfio_mdev_probe' [-Wmissing-prototypes]
drivers/vfio/mdev/vfio_mdev.c:121:6: warning: no previous prototype
for 'vfio_mdev_remove' [-Wmissing-prototypes]

Signed-off-by: Xiongwei Song 
---
  drivers/vfio/mdev/vfio_mdev.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index fa848a701b8b..d230620fe02d 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -111,19 +111,19 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
.mmap   = vfio_mdev_mmap,
  };
  
-int vfio_mdev_probe(struct device *dev)

+static int vfio_mdev_probe(struct device *dev)
  {
struct mdev_device *mdev = to_mdev_device(dev);
  
  	return vfio_add_group_dev(dev, &vfio_mdev_dev_ops, mdev);

  }
  
-void vfio_mdev_remove(struct device *dev)

+static void vfio_mdev_remove(struct device *dev)
  {
vfio_del_group_dev(dev);
  }
  
-struct mdev_driver vfio_mdev_driver = {

+static struct mdev_driver vfio_mdev_driver = {
.name   = "vfio_mdev",
.probe  = vfio_mdev_probe,
.remove = vfio_mdev_remove,

Reviewed-by: Quan Xu

[PATCH] Staging: vt6656: Fix unnecessary parantheses

2017-12-24 Thread Sumit Pundir

This patch fixes a few coding style issues as noted by checkpatch.pl
related to unnecessary parentheses.

This patch fixes the following checkpatch.pl warnings:

WARNING: Unnecessary parentheses around 'priv->eeprom[EEP_OFS_MAJOR_VER] == 0x1'
WARNING: Unnecessary parentheses around 'priv->eeprom[EEP_OFS_MINOR_VER] >= 0x4'

Signed-off-by: Sumit Pundir 
---
 drivers/staging/vt6656/main_usb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/vt6656/main_usb.c 
b/drivers/staging/vt6656/main_usb.c
index 1b51b83..c15ae72 100644
--- a/drivers/staging/vt6656/main_usb.c
+++ b/drivers/staging/vt6656/main_usb.c
@@ -266,8 +266,8 @@ static int vnt_init_registers(struct vnt_private *priv)
 
/* load vt3266 calibration parameters in EEPROM */
if (priv->rf_type == RF_VT3226D0) {
-   if ((priv->eeprom[EEP_OFS_MAJOR_VER] == 0x1) &&
-   (priv->eeprom[EEP_OFS_MINOR_VER] >= 0x4)) {
+   if (priv->eeprom[EEP_OFS_MAJOR_VER] == 0x1 &&
+   priv->eeprom[EEP_OFS_MINOR_VER] >= 0x4) {
calib_tx_iq = priv->eeprom[EEP_OFS_CALIB_TX_IQ];
calib_tx_dc = priv->eeprom[EEP_OFS_CALIB_TX_DC];
calib_rx_iq = priv->eeprom[EEP_OFS_CALIB_RX_IQ];
-- 
2.7.4
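
A note on why the parentheses are redundant: in C, `==` and `>=` bind more tightly than `&&`, so both spellings of the condition parse the same way. The sketch below is a user-space illustration only. The function names and the plain byte arguments are hypothetical stand-ins for `priv->eeprom[EEP_OFS_MAJOR_VER]` and `priv->eeprom[EEP_OFS_MINOR_VER]`, not the driver's interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two EEPROM version bytes. */
static bool calib_supported_parenthesized(unsigned char major,
					  unsigned char minor)
{
	return (major == 0x1) && (minor >= 0x4);
}

static bool calib_supported_plain(unsigned char major, unsigned char minor)
{
	/* == and >= have higher precedence than &&, so this parses the same */
	return major == 0x1 && minor >= 0x4;
}

/* Exhaustively confirm that both spellings agree for every byte pair. */
static void check_equivalence(void)
{
	for (int major = 0; major < 256; major++)
		for (int minor = 0; minor < 256; minor++)
			assert(calib_supported_plain(major, minor) ==
			       calib_supported_parenthesized(major, minor));
}
```

Calling `check_equivalence()` walks all 65536 input pairs, which is cheap enough to run as a sanity check.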

[PATCH] Staging: vt6656: Fix unnecessary 'out of memory' message

2017-12-24 Thread Sumit Pundir

This patch fixes one of the warnings reported by checkpatch.pl, related
to an unnecessary 'out of memory' message.

This patch fixes the following checkpatch.pl warning:

WARNING: Possible unnecessary 'out of memory' message

Signed-off-by: Sumit Pundir 
---
 drivers/staging/vt6656/main_usb.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/staging/vt6656/main_usb.c 
b/drivers/staging/vt6656/main_usb.c
index 1b51b83..ccafcc2 100644
--- a/drivers/staging/vt6656/main_usb.c
+++ b/drivers/staging/vt6656/main_usb.c
@@ -427,11 +427,8 @@ static bool vnt_alloc_bufs(struct vnt_private *priv)
 
for (ii = 0; ii < priv->num_rcb; ii++) {
priv->rcb[ii] = kzalloc(sizeof(*priv->rcb[ii]), GFP_KERNEL);
-   if (!priv->rcb[ii]) {
-   dev_err(&priv->usb->dev,
-   "failed to allocate rcb no %d\n", ii);
+   if (!priv->rcb[ii])
goto free_rx_tx;
-   }
 
rcb = priv->rcb[ii];
 
-- 
2.7.4
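
For context: as far as I understand it, the kernel allocator already logs a warning when an allocation fails (unless the caller suppresses it), which is why checkpatch treats the extra message as noise. The error handling itself does not change: the code still jumps to the cleanup label. Below is a user-space sketch of that shape. `calloc()` stands in for `kzalloc()`, and `struct rcb_pool`, `alloc_rcbs()`, `NUM_RCB` and the `fail_at` fault-injection parameter are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_RCB 4	/* hypothetical pool size */

struct rcb { int in_use; };

struct rcb_pool {
	struct rcb *rcb[NUM_RCB];
};

/* fail_at injects an allocation failure at that index; pass -1 for none. */
static bool alloc_rcbs(struct rcb_pool *pool, int fail_at)
{
	int i;

	assert(pool != NULL);

	for (i = 0; i < NUM_RCB; i++) {
		pool->rcb[i] = (i == fail_at) ?
			NULL : calloc(1, sizeof(*pool->rcb[i]));
		if (!pool->rcb[i])
			goto free_rcbs;	/* no message: the allocator reports it */
	}
	return true;

free_rcbs:
	/* release only the entries allocated before the failure */
	while (--i >= 0) {
		free(pool->rcb[i]);
		pool->rcb[i] = NULL;
	}
	return false;
}
```

The caller only needs the boolean result. Nothing about which index failed is reported from here, which matches the patched driver code.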

Re: [PATCH] perf report: Fix a no annotate browser displayed issue

2017-12-24 Thread Jin, Yao


Hi,

Any comments for this bug fix?

Thanks
Jin Yao

On 12/18/2017 9:26 PM, Jin Yao wrote:

With the '-b' option enabled in perf record, for example,

perf record -b ...
perf report

opening the annotate browser from perf report fails (the annotate
browser can't be displayed).

This is because the '.add_entry_cb' op of struct report is overwritten
by hist_iter__branch_callback() in builtin-report.c, and this function
doesn't do things like mapping symbols and sources. As a result,
do_annotate() returns early:

notes = symbol__annotation(act->ms.sym);
if (!notes->src)
return 0;

This patch adds the missing code to hist_iter__branch_callback()
(see hist_iter__report_callback()).

Signed-off-by: Jin Yao 
---
  tools/perf/builtin-report.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index eb9ce63..0bd0aef 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -162,12 +162,25 @@ static int hist_iter__branch_callback(struct 
hist_entry_iter *iter,
struct hist_entry *he = iter->he;
struct report *rep = arg;
struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   hist__account_cycles(sample->branch_stack, al, sample,
+rep->nonany_branch_mode);
  
  	bi = he->branch_info;

+   err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
branch_type_count(&rep->brtype_stat, &bi->flags,
  bi->from.addr, bi->to.addr);
  
-	return 0;

+out:
+   return err;
  }
  
  static int process_sample_event(struct perf_tool *tool,
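
The control flow the patch introduces can be sketched in isolation. This is a simplified user-space model, not perf code: `struct sym_stats`, `inc_samples()` and `branch_callback()` are hypothetical stand-ins for the two `addr_map_symbol__inc_samples()` calls and the callback, and the counter stands in for `branch_type_count()`.

```c
#include <assert.h>
#include <stddef.h>

struct sym_stats {
	int samples;
	int fail;	/* non-zero makes inc_samples() report an error */
};

/* Hypothetical stand-in: returns 0 on success, negative on failure. */
static int inc_samples(struct sym_stats *s)
{
	assert(s != NULL);
	if (s->fail)
		return -1;
	s->samples++;
	return 0;
}

static int branch_callback(struct sym_stats *from, struct sym_stats *to,
			   int *branches_counted)
{
	int err;

	err = inc_samples(from);
	if (err)
		goto out;	/* first failure skips everything else */

	err = inc_samples(to);

	/* as in the patch, the branch is counted even if "to" failed */
	(*branches_counted)++;
out:
	return err;
}
```

The point is that the callback now returns the error from the sample accounting instead of an unconditional 0.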

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-24 Thread Chao Yu

On 2017/12/23 21:09, Yunlong Song wrote:
> For some corner case, f2fs_gc selects one target victim but cannot free
> that victim segment due to some reason (e.g. the segment has some blocks
> of atomic file which is not committed yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,

> segment may probably be selected over and over, and then f2fs_gc will
> go to dead loop. This patch identifies the dead-loop segment, and skips
> it in __get_victim next time.
> 
> Signed-off-by: Yunlong Song 
> ---
>  fs/f2fs/f2fs.h  |  8 
>  fs/f2fs/gc.c| 34 ++
>  fs/f2fs/super.c |  3 +++
>  3 files changed, 45 insertions(+)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index ca6b0c9..b75851b 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -115,6 +115,13 @@ struct f2fs_mount_info {
>   unsigned intopt;
>  };
>  
> +struct gc_loop_info {
> + int count;
> + unsigned int segno;
> + unsigned long *segmap;
> +};
> +#define GC_LOOP_MAX 10
> +
>  #define F2FS_FEATURE_ENCRYPT 0x0001
>  #define F2FS_FEATURE_BLKZONED0x0002
>  #define F2FS_FEATURE_ATOMIC_WRITE0x0004
> @@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
>  
>   /* threshold for converting bg victims for fg */
>   u64 fggc_threshold;
> + struct gc_loop_info gc_loop;
>  
>   /* maximum # of trials to find a victim segment for SSR and GC */
>   unsigned int max_victim_search;
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 5d5bba4..4ee9e1b 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
> *sbi)
>   if (no_fggc_candidate(sbi, secno))
>   continue;
>  
> + if (sbi->gc_loop.segmap &&
> + test_bit(GET_SEG_FROM_SEC(sbi, secno), 
> sbi->gc_loop.segmap))
> + continue;
> +
>   clear_bit(secno, dirty_i->victim_secmap);
>   return GET_SEG_FROM_SEC(sbi, secno);
>   }
> @@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
>   if (gc_type == FG_GC && p.alloc_mode == LFS &&
>   no_fggc_candidate(sbi, secno))
>   goto next;
> + if (gc_type == FG_GC && p.alloc_mode == LFS &&
> + sbi->gc_loop.segmap && test_bit(segno, 
> sbi->gc_loop.segmap))
> + goto next;
>  
>   cost = get_gc_cost(sbi, segno, &p);
>  
> @@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>   seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
>   if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
>   sec_freed++;
> + else if (gc_type == FG_GC && seg_freed == 0) {
> + if (!sbi->gc_loop.segmap) {
> + sbi->gc_loop.segmap =
> + kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
> GFP_KERNEL);
> + sbi->gc_loop.count = 0;
> + sbi->gc_loop.segno = NULL_SEGNO;
> + }
> + if (segno == sbi->gc_loop.segno) {
> + if (sbi->gc_loop.count > GC_LOOP_MAX) {
> + f2fs_bug_on(sbi, 1);
> + set_bit(segno, sbi->gc_loop.segmap);
> + sbi->gc_loop.count = 0;
> + sbi->gc_loop.segno = NULL_SEGNO;
> + }
> + else
> + sbi->gc_loop.count++;
> + } else {
> + sbi->gc_loop.segno = segno;
> + sbi->gc_loop.count = 0;
> + }
> + }
>   total_freed += seg_freed;
>  
>   if (gc_type == FG_GC)
> @@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
>  
>   if (sync)
>   ret = sec_freed ? 0 : -EAGAIN;
> + if (sbi->gc_loop.segmap) {
> + kvfree(sbi->gc_loop.segmap);
> + sbi->gc_loop.segmap = NULL;
> + sbi->gc_loop.count = 0;
> + sbi->gc_loop.segno = NULL_SEGNO;
> + }
>   return ret;
>  }
>  
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 031cb26..76f0b72 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
> *data, int silent)
>   sbi->last_valid_block_count = sbi->total_valid_block_count;
>   sbi->reserved_blocks = 0;
>   sbi->current_reserved_blocks = 0;
> + sbi->gc_loop.segmap = NULL;
> + sbi->gc_loop.count = 0;
> + sbi->gc_loop.segno = NULL_SEGNO;
>  
>   for (i = 0; i < NR_INODE_TYPE; i++) {
>   INIT_LIST_HEAD(&sbi->inode_list[i]);
>
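
To make the counting in the quoted patch easier to follow, here is a user-space sketch of the same bookkeeping. It is a model under assumptions, not the f2fs code: `calloc()`/`free()` replace `kvzalloc()`/`kvfree()`, the `f2fs_bug_on()` call is left out, the helper names (`gc_loop_note_failure()`, `gc_loop_test_bit()`, `gc_loop_reset()`) are hypothetical, and an allocation check is added that the posted patch does not have.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define GC_LOOP_MAX	10
#define NULL_SEGNO	((unsigned int)~0)
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

struct gc_loop_info {
	int count;		/* repeated failures on the same segno */
	unsigned int segno;	/* last segment that freed nothing */
	unsigned long *segmap;	/* one bit per segment; set = skip it */
};

static bool gc_loop_test_bit(const struct gc_loop_info *gl, unsigned int segno)
{
	return gl->segmap &&
	       ((gl->segmap[segno / BITS_PER_WORD] >>
		 (segno % BITS_PER_WORD)) & 1UL);
}

/*
 * Record that foreground GC freed nothing from segno.
 * Returns true when the segment has just been marked to be skipped.
 */
static bool gc_loop_note_failure(struct gc_loop_info *gl, unsigned int segno,
				 unsigned int nr_segs)
{
	assert(segno < nr_segs);

	if (!gl->segmap) {
		size_t words = (nr_segs + BITS_PER_WORD - 1) / BITS_PER_WORD;

		gl->segmap = calloc(words, sizeof(unsigned long));
		if (!gl->segmap)
			return false;	/* check added in this sketch only */
		gl->count = 0;
		gl->segno = NULL_SEGNO;
	}

	if (segno != gl->segno) {
		/* a different victim: restart the count */
		gl->segno = segno;
		gl->count = 0;
		return false;
	}
	if (gl->count <= GC_LOOP_MAX) {
		gl->count++;
		return false;
	}

	gl->segmap[segno / BITS_PER_WORD] |= 1UL << (segno % BITS_PER_WORD);
	gl->count = 0;
	gl->segno = NULL_SEGNO;
	return true;
}

/* Mirrors the cleanup at the end of f2fs_gc() in the patch. */
static void gc_loop_reset(struct gc_loop_info *gl)
{
	free(gl->segmap);
	gl->segmap = NULL;
	gl->count = 0;
	gl->segno = NULL_SEGNO;
}
```

With `GC_LOOP_MAX` at 10 and the `count > GC_LOOP_MAX` comparison used in the patch, the same segment has to fail 13 times in a row before its bit is set. Since the patch frees the bitmap when `f2fs_gc()` returns, the skip list only lives for one invocation.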

[PATCH] usb: dwc3: gadget: decrease the queued_requests in removal

2017-12-24 Thread Lipengcheng

When removing requests, the corresponding TRBs must be put into the
disabled state (HWO = 0) and dep->queued_requests must be decremented
accordingly. It would be better to use a separate function to disable
a TRB (HWO = 0).

Signed-off-by: Pengcheng Li 
---
drivers/usb/dwc3/gadget.c | 30 ++
1 file changed, 30 insertions(+)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1e6c42e..273b51d 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -707,6 +707,36 @@ static void dwc3_remove_requests(struct dwc3 *dwc, struct 
dwc3_ep *dep)
    while (!list_empty(&dep->started_list)) {
    req = next_request(&dep->started_list);

+   if (req->trb) {
+   if (req->num_pending_sgs) {
+   struct dwc3_trb *trb;
+   int i = 0;
+
+   for (i = 0; i < req->num_pending_sgs; i++) {
+   trb = req->trb + i;
+   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+   dwc3_ep_inc_deq(dep);
+   }
+
+   if (req->unaligned || req->zero) {
+   trb = req->trb + req->num_pending_sgs + 1;
+   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+   dwc3_ep_inc_deq(dep);
+   }
+   } else {
+   struct dwc3_trb *trb = req->trb;
+
+   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+   dwc3_ep_inc_deq(dep);
+
+   if (req->unaligned || req->zero) {
+   trb = req->trb + 1;
+   trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+   dwc3_ep_inc_deq(dep);
+   }
+   }
+   }
+   dep->queued_requests--;
    dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
    }

--
2.7.4
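
The commit message suggests a separate function for disabling TRBs. A minimal sketch of what such a helper could look like is below. It is only an illustration: `struct trb`, `TRB_CTRL_HWO` and `trbs_disable()` are hypothetical names, the bit position is an assumption made for this sketch, and the dequeue-pointer update done by `dwc3_ep_inc_deq()` is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRB_CTRL_HWO	(1u << 0)	/* assumed bit position for this sketch */

struct trb {
	uint32_t ctrl;
};

/*
 * Clear the hardware-owned flag on n consecutive TRBs.
 * Returns how many of them were owned by hardware before the call.
 */
static size_t trbs_disable(struct trb *trb, size_t n)
{
	size_t owned = 0;

	assert(trb != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (trb[i].ctrl & TRB_CTRL_HWO)
			owned++;
		trb[i].ctrl &= ~TRB_CTRL_HWO;	/* other control bits untouched */
	}
	return owned;
}
```

A helper of this shape would let the scatter-gather case and the single-TRB case in the patch share one loop instead of repeating the same three lines.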

[PATCH v2 07/12] drm/qxl: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn implementation
when a driver does not set one, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/qxl/qxl_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
index ab48238..a86eaf9 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -393,7 +393,6 @@ static struct ttm_bo_driver qxl_bo_driver = {
.verify_access = &qxl_verify_access,
.io_mem_reserve = &qxl_ttm_io_mem_reserve,
.io_mem_free = &qxl_ttm_io_mem_free,
-   .io_mem_pfn = ttm_bo_default_io_mem_pfn,
.move_notify = &qxl_bo_move_notify,
 };
 
-- 
2.7.4

[PATCH v2 08/12] drm/radeon: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn implementation
when a driver does not set one, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 6ada64d..8595c76 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -844,7 +844,6 @@ static struct ttm_bo_driver radeon_bo_driver = {
.fault_reserve_notify = &radeon_bo_fault_reserve_notify,
.io_mem_reserve = &radeon_ttm_io_mem_reserve,
.io_mem_free = &radeon_ttm_io_mem_free,
-   .io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int radeon_ttm_init(struct radeon_device *rdev)
-- 
2.7.4

[PATCH v2 01/12] drm/ttm: add ttm_bo_io_mem_pfn to check io_mem_pfn

2017-12-24 Thread Tan Xiaojun

The io_mem_pfn field was added in commit ea642c3216cb ("drm/ttm: add
io_mem_pfn callback") and is called unconditionally. However, not all
drivers were updated to set it.

Fall back to ttm_bo_default_io_mem_pfn() when a driver has not set its
own callback, and add a new wrapper, ttm_bo_io_mem_pfn(), that performs
this check.

Signed-off-by: Michal Srb 
Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index c8ebb75..292d157 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -92,6 +92,17 @@ static int ttm_bo_vm_fault_idle(struct ttm_buffer_object *bo,
 	return ret;
 }
 
+static unsigned long ttm_bo_io_mem_pfn(struct ttm_buffer_object *bo,
+				       unsigned long page_offset)
+{
+	struct ttm_bo_device *bdev = bo->bdev;
+
+	if (bdev->driver->io_mem_pfn)
+		return bdev->driver->io_mem_pfn(bo, page_offset);
+
+	return ttm_bo_default_io_mem_pfn(bo, page_offset);
+}
+
 static int ttm_bo_vm_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -234,7 +245,7 @@ static int ttm_bo_vm_fault(struct vm_fault *vmf)
 		if (bo->mem.bus.is_iomem) {
 			/* Iomem should not be marked encrypted */
 			cvma.vm_page_prot = pgprot_decrypted(cvma.vm_page_prot);
-			pfn = bdev->driver->io_mem_pfn(bo, page_offset);
+			pfn = ttm_bo_io_mem_pfn(bo, page_offset);
 		} else {
 			page = ttm->pages[page_offset];
 			if (unlikely(!page && i == 0)) {
-- 
2.7.4
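
The fallback pattern in this patch can be sketched as a small userspace analogue: call the driver's callback if it is set, otherwise use a built-in default. This is only an illustration under stated assumptions. The demo_* types and functions are hypothetical, and DEMO_PAGE_SHIFT assumes 4 KiB pages, whereas the kernel's PAGE_SHIFT depends on the architecture and configuration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

struct demo_bo;

/* Hypothetical driver vtable; io_mem_pfn is optional and may be NULL. */
struct demo_driver {
	unsigned long (*io_mem_pfn)(struct demo_bo *bo,
				    unsigned long page_offset);
};

/* Hypothetical buffer object carrying a bus address and its driver. */
struct demo_bo {
	const struct demo_driver *driver;
	unsigned long bus_base;
	unsigned long bus_offset;
};

/* Default calculation, same arithmetic as ttm_bo_default_io_mem_pfn(). */
static unsigned long demo_default_io_mem_pfn(struct demo_bo *bo,
					     unsigned long page_offset)
{
	return ((bo->bus_base + bo->bus_offset) >> DEMO_PAGE_SHIFT)
		+ page_offset;
}

/* Wrapper: prefer the driver's callback, fall back to the default. */
static unsigned long demo_io_mem_pfn(struct demo_bo *bo,
				     unsigned long page_offset)
{
	assert(bo != NULL && bo->driver != NULL);

	if (bo->driver->io_mem_pfn)
		return bo->driver->io_mem_pfn(bo, page_offset);

	return demo_default_io_mem_pfn(bo, page_offset);
}

/* Example of a driver-specific override, used to exercise the first path. */
static unsigned long demo_custom_io_mem_pfn(struct demo_bo *bo,
					    unsigned long page_offset)
{
	(void)bo;
	return 1000 + page_offset;
}
```

A driver whose vtable leaves io_mem_pfn as NULL gets the default result, while one that points it at demo_custom_io_mem_pfn() gets its own. Callers only ever use demo_io_mem_pfn(), so the NULL check lives in one place.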

[PATCH v2 09/12] drm/virtio: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/virtio/virtgpu_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_ttm.c b/drivers/gpu/drm/virtio/virtgpu_ttm.c
index cd389c5..4a12434 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ttm.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ttm.c
@@ -431,7 +431,6 @@ static struct ttm_bo_driver virtio_gpu_bo_driver = {
 	.verify_access = &virtio_gpu_verify_access,
 	.io_mem_reserve = &virtio_gpu_ttm_io_mem_reserve,
 	.io_mem_free = &virtio_gpu_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 	.move_notify = &virtio_gpu_bo_move_notify,
 	.swap_notify = &virtio_gpu_bo_swap_notify,
 };
-- 
2.7.4

[PATCH v2 12/12] drm/ttm: remove ttm_bo_default_io_mem_pfn

2017-12-24 Thread Tan Xiaojun

ttm_bo_io_mem_pfn() is now the only caller of
ttm_bo_default_io_mem_pfn(), so fold the calculation into
ttm_bo_io_mem_pfn() and remove the function along with its export and
kernel-doc.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 11 ++---------
 include/drm/ttm/ttm_bo_api.h    | 11 -----------
 2 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 292d157..6edc19f 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -100,7 +100,8 @@ static unsigned long ttm_bo_io_mem_pfn(struct ttm_buffer_object *bo,
 	if (bdev->driver->io_mem_pfn)
 		return bdev->driver->io_mem_pfn(bo, page_offset);
 
-	return ttm_bo_default_io_mem_pfn(bo, page_offset);
+	return ((bo->mem.bus.base + bo->mem.bus.offset) >> PAGE_SHIFT)
+		+ page_offset;
 }
 
 static int ttm_bo_vm_fault(struct vm_fault *vmf)
@@ -415,14 +416,6 @@ static struct ttm_buffer_object *ttm_bo_vm_lookup(struct ttm_bo_device *bdev,
 	return bo;
 }
 
-unsigned long ttm_bo_default_io_mem_pfn(struct ttm_buffer_object *bo,
-					unsigned long page_offset)
-{
-	return ((bo->mem.bus.base + bo->mem.bus.offset) >> PAGE_SHIFT)
-		+ page_offset;
-}
-EXPORT_SYMBOL(ttm_bo_default_io_mem_pfn);
-
 int ttm_bo_mmap(struct file *filp, struct vm_area_struct *vma,
 		struct ttm_bo_device *bdev)
 {
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index fa07be1..0b1ce05 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -711,17 +711,6 @@ extern int ttm_fbdev_mmap(struct vm_area_struct *vma,
 			  struct ttm_buffer_object *bo);
 
 /**
- * ttm_bo_default_iomem_pfn - get a pfn for a page offset
- *
- * @bo: the BO we need to look up the pfn for
- * @page_offset: offset inside the BO to look up.
- *
- * Calculate the PFN for iomem based mappings during page fault
- */
-unsigned long ttm_bo_default_io_mem_pfn(struct ttm_buffer_object *bo,
-					unsigned long page_offset);
-
-/**
  * ttm_bo_mmap - mmap out of the ttm device address space.
  *
  * @filp:  filp as input from the mmap method.
-- 
2.7.4
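
A worked example of the arithmetic folded into ttm_bo_io_mem_pfn() may help: the bus base and offset are added first, the sum is shifted right by the page shift to get a page frame number, and the page offset is added last. The sketch below is standalone userspace C with hypothetical demo_* names, and it assumes 4 KiB pages (a shift of 12). It also contrasts the expression with shifting each term separately, which gives a different answer when the two addresses are not page-aligned. In practice these values are probably page-aligned, in which case both forms agree.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Same order of operations as the patch: add, shift, then add page_offset. */
static uint64_t demo_pfn(uint64_t bus_base, uint64_t bus_offset,
			 uint64_t page_offset)
{
	/* Guard against wrap-around in the byte-address sum. */
	assert(bus_base <= UINT64_MAX - bus_offset);

	return ((bus_base + bus_offset) >> DEMO_PAGE_SHIFT) + page_offset;
}

/*
 * Shown only for contrast: shifting each term before adding discards
 * the sub-page bits of both addresses, so a carry across a page
 * boundary is lost.
 */
static uint64_t demo_pfn_shift_first(uint64_t bus_base, uint64_t bus_offset,
				     uint64_t page_offset)
{
	return (bus_base >> DEMO_PAGE_SHIFT) + (bus_offset >> DEMO_PAGE_SHIFT)
		+ page_offset;
}
```

For a base of 0xE0000000, an offset of 0x3000 and a page offset of 2, the sum 0xE0003000 shifted by 12 is 0xE0003, so the result is 0xE0005. With a base of 0x1800 and an offset of 0x800, demo_pfn() yields 2 while demo_pfn_shift_first() yields 1.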

[PATCH v2 11/12] staging: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/staging/vboxvideo/vbox_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/vboxvideo/vbox_ttm.c b/drivers/staging/vboxvideo/vbox_ttm.c
index 4eb410a..4da1723 100644
--- a/drivers/staging/vboxvideo/vbox_ttm.c
+++ b/drivers/staging/vboxvideo/vbox_ttm.c
@@ -241,7 +241,6 @@ static struct ttm_bo_driver vbox_bo_driver = {
 	.verify_access = vbox_bo_verify_access,
 	.io_mem_reserve = &vbox_ttm_io_mem_reserve,
 	.io_mem_free = &vbox_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int vbox_mm_init(struct vbox_private *vbox)
-- 
2.7.4

[PATCH v2 00/12] drm: add check if io_mem_pfn is NULL and cleanup

2017-12-24 Thread Tan Xiaojun

I hit an oops while running graphical tests with the mainline kernel on a
HiSilicon D05 board. I did not know how to solve it until I saw your
discussion of this issue from a month ago:

https://lists.freedesktop.org/archives/dri-devel/2017-November/159046.html

The solution proposed there fixes my problem completely.

This matters to me and I would like to see it fixed as soon as possible,
so I followed the outcome of that discussion and prepared the patches
below.

If anything needs improving, please point it out. Thanks.

Changes in v2:
 * add a new function, ttm_bo_io_mem_pfn(), to replace
   ttm_bo_default_io_mem_pfn(), and do some cleanup.

Tan Xiaojun (12):
  drm/ttm: add ttm_bo_io_mem_pfn to check io_mem_pfn
  drm/ast: remove the default io_mem_pfn set
  drm/bochs: remove the default io_mem_pfn set
  drm/cirrus: remove the default io_mem_pfn set
  drm/mgag200: remove the default io_mem_pfn set
  drm/nouveau: remove the default io_mem_pfn set
  drm/qxl: remove the default io_mem_pfn set
  drm/radeon: remove the default io_mem_pfn set
  drm/virtio: remove the default io_mem_pfn set
  drm/vmwgfx: remove the default io_mem_pfn set
  staging: remove the default io_mem_pfn set
  drm/ttm: remove ttm_bo_default_io_mem_pfn

 drivers/gpu/drm/ast/ast_ttm.c          |  1 -
 drivers/gpu/drm/bochs/bochs_mm.c       |  1 -
 drivers/gpu/drm/cirrus/cirrus_ttm.c    |  1 -
 drivers/gpu/drm/mgag200/mgag200_ttm.c  |  1 -
 drivers/gpu/drm/nouveau/nouveau_bo.c   |  1 -
 drivers/gpu/drm/qxl/qxl_ttm.c          |  1 -
 drivers/gpu/drm/radeon/radeon_ttm.c    |  1 -
 drivers/gpu/drm/ttm/ttm_bo_vm.c        | 22 +++++++++++++---------
 drivers/gpu/drm/virtio/virtgpu_ttm.c   |  1 -
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c |  1 -
 drivers/staging/vboxvideo/vbox_ttm.c   |  1 -
 include/drm/ttm/ttm_bo_api.h           | 11 -----------
 12 files changed, 13 insertions(+), 30 deletions(-)

-- 
2.7.4

[PATCH v2 04/12] drm/cirrus: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/cirrus/cirrus_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c
index 1ff1838..2c652af 100644
--- a/drivers/gpu/drm/cirrus/cirrus_ttm.c
+++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c
@@ -237,7 +237,6 @@ struct ttm_bo_driver cirrus_bo_driver = {
 	.verify_access = cirrus_bo_verify_access,
 	.io_mem_reserve = &cirrus_ttm_io_mem_reserve,
 	.io_mem_free = &cirrus_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int cirrus_mm_init(struct cirrus_device *cirrus)
-- 
2.7.4

[PATCH v2 05/12] drm/mgag200: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/mgag200/mgag200_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c
index 3e7e1cd..89b550f 100644
--- a/drivers/gpu/drm/mgag200/mgag200_ttm.c
+++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c
@@ -237,7 +237,6 @@ struct ttm_bo_driver mgag200_bo_driver = {
 	.verify_access = mgag200_bo_verify_access,
 	.io_mem_reserve = &mgag200_ttm_io_mem_reserve,
 	.io_mem_free = &mgag200_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int mgag200_mm_init(struct mga_device *mdev)
-- 
2.7.4

[PATCH v2 03/12] drm/bochs: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/bochs/bochs_mm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/bochs/bochs_mm.c b/drivers/gpu/drm/bochs/bochs_mm.c
index c4cadb6..857755a 100644
--- a/drivers/gpu/drm/bochs/bochs_mm.c
+++ b/drivers/gpu/drm/bochs/bochs_mm.c
@@ -205,7 +205,6 @@ struct ttm_bo_driver bochs_bo_driver = {
 	.verify_access = bochs_bo_verify_access,
 	.io_mem_reserve = &bochs_ttm_io_mem_reserve,
 	.io_mem_free = &bochs_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int bochs_mm_init(struct bochs_device *bochs)
-- 
2.7.4

[PATCH v2 06/12] drm/nouveau: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 435ff86..8de82a3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1667,5 +1667,4 @@ struct ttm_bo_driver nouveau_bo_driver = {
 	.fault_reserve_notify = &nouveau_ttm_fault_reserve_notify,
 	.io_mem_reserve = &nouveau_ttm_io_mem_reserve,
 	.io_mem_free = &nouveau_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
-- 
2.7.4

[PATCH v2 10/12] drm/vmwgfx: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
index c705632..828dd59 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
@@ -859,5 +859,4 @@ struct ttm_bo_driver vmw_bo_driver = {
 	.fault_reserve_notify = &vmw_ttm_fault_reserve_notify,
 	.io_mem_reserve = &vmw_ttm_io_mem_reserve,
 	.io_mem_free = &vmw_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
-- 
2.7.4

[PATCH v2 02/12] drm/ast: remove the default io_mem_pfn set

2017-12-24 Thread Tan Xiaojun

The TTM core now falls back to the default io_mem_pfn calculation when a
driver leaves the callback unset, so remove the explicit default
assignment from this driver.

Signed-off-by: Tan Xiaojun 
---
 drivers/gpu/drm/ast/ast_ttm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
index 696a15d..fdd521d 100644
--- a/drivers/gpu/drm/ast/ast_ttm.c
+++ b/drivers/gpu/drm/ast/ast_ttm.c
@@ -237,7 +237,6 @@ struct ttm_bo_driver ast_bo_driver = {
 	.verify_access = ast_bo_verify_access,
 	.io_mem_reserve = &ast_ttm_io_mem_reserve,
 	.io_mem_free = &ast_ttm_io_mem_free,
-	.io_mem_pfn = ttm_bo_default_io_mem_pfn,
 };
 
 int ast_mm_init(struct ast_private *ast)
-- 
2.7.4

Re: [linux-sunxi] [PATCH v4 0/2] Initial Allwinner V3s CSI Support

2017-12-24 Thread Yong

Hi,

On Fri, 22 Dec 2017 14:46:48 +0100
Ondřej Jirman  wrote:

> Hello,
> 
> Yong Deng píše v Pá 22. 12. 2017 v 17:32 +0800:
> > This patchset adds initial support for the Allwinner V3s CSI.
> > 
> > The Allwinner V3s SoC has two CSI modules. CSI0 is used for the MIPI
> > interface and CSI1 for the parallel interface. This is not documented
> > in the datasheet; it was determined by testing and guesswork.
> > 
> > This patchset implements a V4L2 framework driver and adds binding
> > documentation for it.
> > 
> > Currently, the driver only supports the parallel interface. It has been
> > tested with a BT1120 signal generated by an FPGA. The following
> > features are not supported by this patchset:
> >   - ISP 
> >   - MIPI-CSI2
> >   - Master clock for camera sensor
> >   - Power regulator for the front end IC
> > 
> > Thanks for Ondřej Jirman's help.
> > 
> > Changes in v4:
> >   * Deal with the CSI 'INNER QUEUE'.
> > CSI looks up the next DMA buffer for the next frame before the
> > frame-done IRQ for the current frame is triggered. This is not
> > documented but was reported by Ondřej Jirman.
> > The BSP code has a workaround for this too. It skips marking the
> > first buffer as done for VB2 and passes the second buffer to CSI
> > in the first frame-done ISR call. Then, in the second frame-done
> > ISR call, it marks the first buffer as done for VB2 and passes
> > the third buffer to CSI, and so on. The downside is that the
> > first buffer is written twice and the first frame is dropped
> > even when enough buffers are queued.
> > So I made an improvement here: pass the next buffer to CSI
> > right after starting the CSI. In this case, the first frame is
> > stored in the first buffer and the second frame in the second
> > buffer. This method avoids dropping the first frame, but frames
> > are still dropped when too few buffers are queued.
> >   * Fix: using a wrong mbus_code when getting the supported formats
> >   * Change all fourcc to pixformat
> >   * Change some function names
> > 
> > Changes in v3:
> >   * Get rid of struct sun6i_csi_ops
> >   * Move sun6i-csi to new directory drivers/media/platform/sunxi
> >   * Merge sun6i_csi.c and sun6i_csi_v3s.c into sun6i_csi.c
> >   * Use generic fwnode endpoints parser
> >   * Only support a single subdev to make things simple
> >   * Many complaintion fix
> > 
> > Changes in v2: 
> >   * Change sunxi-csi to sun6i-csi
> >   * Rebase to media_tree master branch 
> > 
> > Following is the 'v4l2-compliance -s -f' output. I have tested this
> > with both interlaced and progressive signals:
> > 
> > # ./v4l2-compliance -s -f
> > v4l2-compliance SHA   : 6049ea8bd64f9d78ef87ef0c2b3dc9b5de1ca4a1
> > 
> > Driver Info:
> > Driver name   : sun6i-video
> > Card type : sun6i-csi
> > Bus info  : platform:csi
> > Driver version: 4.15.0
> > Capabilities  : 0x8421
> > Video Capture
> > Streaming
> > Extended Pix Format
> > Device Capabilities
> > Device Caps   : 0x0421
> > Video Capture
> > Streaming
> > Extended Pix Format
> > 
> > Compliance test for device /dev/video0 (not using libv4l2):
> > 
> > Required ioctls:
> > test VIDIOC_QUERYCAP: OK
> > 
> > Allow for multiple opens:
> > test second video open: OK
> > test VIDIOC_QUERYCAP: OK
> > test VIDIOC_G/S_PRIORITY: OK
> > test for unlimited opens: OK
> > 
> > Debug ioctls:
> > test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
> > test VIDIOC_LOG_STATUS: OK (Not Supported)
> > 
> > Input ioctls:
> > test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
> > test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
> > test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
> > test VIDIOC_ENUMAUDIO: OK (Not Supported)
> > test VIDIOC_G/S/ENUMINPUT: OK
> > test VIDIOC_G/S_AUDIO: OK (Not Supported)
> > Inputs: 1 Audio Inputs: 0 Tuners: 0
> > 
> > Output ioctls:
> > test VIDIOC_G/S_MODULATOR: OK (Not Supported)
> > test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
> > test VIDIOC_ENUMAUDOUT: OK (Not Supported)
> > test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
> > test VIDIOC_G/S_AUDOUT: OK (Not Supported)
> > Outputs: 0 Audio Outputs: 0 Modulators: 0
> > 
> > Input/Output configuration ioctls:
> > test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
> > test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
> > test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
> > test VIDIOC_G/S_EDID: OK (Not Supported)
> > 
> > Test input 0:
> > 
> > Control ioctls:
> > test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK (Not Supported)
> > test VIDIOC_QUERYCTRL: OK (Not Supported)
> > test VIDIOC_G/S_CTRL: OK (Not Supported)
> >

Re: [PATCH 4/4] KVM: nVMX: initialize more non-shadowed fields in prepare_vmcs02_full

2017-12-24 Thread Wanpeng Li

2017-12-21 20:43 GMT+08:00 Paolo Bonzini :
> These fields are also simple copies of the data in the vmcs12 struct.
> For some of them, prepare_vmcs02 was skipping the copy when the field
> was unused.  In prepare_vmcs02_full, we copy them always as long as the
> field exists on the host, because the corresponding execution control
> might be one of the shadowed fields.

Why didn't we need to always copy them before this patchset?

Regards,
Wanpeng Li

Re: [PATCH 2/4] KVM: nVMX: track dirty state of non-shadowed VMCS fields

2017-12-24 Thread Wanpeng Li

2017-12-21 20:43 GMT+08:00 Paolo Bonzini :
> VMCS12 fields that are not handled through shadow VMCS are rarely
> written, and thus they are also almost constant in the vmcs02.  We can
> thus optimize prepare_vmcs02 by skipping all the work for non-shadowed
> fields in the common case.
>
> This patch introduces the (pretty simple) tracking infrastructure; the
> next patches will move work to prepare_vmcs02_full and save a few hundred
> clock cycles per VMRESUME on a Haswell Xeon E5 system:
>
>                                  before  after
> cpuid                            14159   13869
> vmcall                           15290   14951
> inl_from_kernel                  17703   17447
> outl_to_kernel                   16011   14692
> self_ipi_sti_nop                 16763   15825
> self_ipi_tpr_sti_nop             17341   15935
> wr_tsc_adjust_msr                14510   14264
> rd_tsc_adjust_msr                15018   14311
> mmio-wildcard-eventfd:pci-mem    16381   14947
> mmio-datamatch-eventfd:pci-mem   18620   17858
> portio-wildcard-eventfd:pci-io   15121   14769
> portio-datamatch-eventfd:pci-io  15761   14831
>
> (average savings 748, stdev 460).
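
The summary figures can be re-derived from the table. The sketch below is only a sanity check, assuming "savings" means before minus after for each test and "stdev" means the sample standard deviation (n - 1 in the denominator); the helper names are made up for illustration. Under those assumptions the mean should come out at about 748.25 and the deviation at about 460.1.

```c
#include <assert.h>
#include <stddef.h>

/* Cycle counts copied from the table in the commit message. */
static const int before[] = { 14159, 15290, 17703, 16011, 16763, 17341,
                              14510, 15018, 16381, 18620, 15121, 15761 };
static const int after[]  = { 13869, 14951, 17447, 14692, 15825, 15935,
                              14264, 14311, 14947, 17858, 14769, 14831 };
#define N_TESTS (sizeof(before) / sizeof(before[0]))

/* Newton's method square root, used to avoid depending on libm. */
static double sqrt_newton(double x)
{
    double r = x > 1.0 ? x : 1.0;
    int i;

    assert(x >= 0.0);
    for (i = 0; i < 64; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Mean of the per-test savings (before - after). */
static double mean_savings(void)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < N_TESTS; i++)
        sum += before[i] - after[i];
    return sum / N_TESTS;
}

/* Sample standard deviation of the per-test savings. */
static double stdev_savings(void)
{
    double mean = mean_savings(), acc = 0.0;
    size_t i;

    for (i = 0; i < N_TESTS; i++) {
        double d = (before[i] - after[i]) - mean;

        acc += d * d;
    }
    return sqrt_newton(acc / (N_TESTS - 1));
}
```

Calling mean_savings() and stdev_savings() from a small main() and printing the results is enough to compare against the numbers quoted in the commit message.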
>
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/x86/kvm/vmx.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 2ee842990976..8b6013b529b3 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -441,6 +441,7 @@ struct nested_vmx {
>  * data hold by vmcs12
>  */
> bool sync_shadow_vmcs;
> +   bool dirty_vmcs12;
>
> bool change_vmcs01_virtual_x2apic_mode;
> /* L2 must run next, and mustn't decide to exit to L1. */
> @@ -7879,8 +7880,10 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
>  {
> unsigned long field;
> gva_t gva;
> +   struct vcpu_vmx *vmx = to_vmx(vcpu);
> unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
> u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
> +
> /* The value to write might be 32 or 64 bits, depending on L1's long
>  * mode, and eventually we need to write that into a field of several
>  * possible lengths. The code below first zero-extends the value to 64
> @@ -7923,6 +7926,20 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
> return kvm_skip_emulated_instruction(vcpu);
> }
>
> +   switch (field) {
> +#define SHADOW_FIELD_RW(x) case x:
> +#include "vmx_shadow_fields.h"

What will happen here if enable_shadow_vmcs == false?

Regards,
Wanpeng Li

> +   /*
> +* The fields that can be updated by L1 without a vmexit are
> +* always updated in the vmcs02, the others go down the slow
> +* path of prepare_vmcs02.
> +*/
> +   break;
> +   default:
> +   vmx->nested.dirty_vmcs12 = true;
> +   break;
> +   }
> +
> nested_vmx_succeed(vcpu);
> return kvm_skip_emulated_instruction(vcpu);
>  }
> @@ -7937,6 +7954,7 @@ static void set_current_vmptr(struct vcpu_vmx *vmx, gpa_t vmptr)
>  __pa(vmx->vmcs01.shadow_vmcs));
> vmx->nested.sync_shadow_vmcs = true;
> }
> +   vmx->nested.dirty_vmcs12 = true;
>  }
>
>  /* Emulate the VMPTRLD instruction */
> @@ -10569,6 +10587,11 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
> return 0;
>  }
>
> +static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> +  bool from_vmentry)
> +{
> +}
> +
>  /*
>   * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
>   * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it
> @@ -10864,7 +10887,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
> vmx_flush_tlb(vcpu, true);
> }
> -
> }
>
> if (enable_pml) {
> @@ -10913,6 +10935,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
> /* Note: modifies VM_ENTRY/EXIT_CONTROLS and GUEST/HOST_IA32_EFER */
> vmx_set_efer(vcpu, vcpu->arch.efer);
>
> +   if (vmx->nested.dirty_vmcs12) {
> +   prepare_vmcs02_full(vcpu, vmcs12, from_vmentry);
> +   vmx->nested.dirty_vmcs12 = false;
> +   }
> +
> /* Shadow page tables on either EPT or shadow page tables. */
> if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
> entry_failure_code))
> --
> 1.8.3.1
>
>
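
To make the tracking scheme in the quoted patch easier to follow, here is a minimal userspace sketch of the same idea. It is not the kernel code: the field identifiers, the struct and the helper names are all hypothetical, and which fields count as "shadowed" is an assumption made for illustration. It only shows the pattern of a dirty flag that is set on writes to rarely written fields and on loading a new vmcs12, and cleared once the slow path has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical field identifiers; the real encodings live in the kernel. */
enum field_id {
    FIELD_GUEST_RIP,        /* assumed to be on the shadowed (fast) list */
    FIELD_GUEST_RSP,        /* assumed to be on the shadowed (fast) list */
    FIELD_EPT_POINTER,      /* assumed to be rarely written */
    FIELD_MSR_BITMAP,       /* assumed to be rarely written */
};

struct nested_state {
    bool dirty;             /* plays the role of nested.dirty_vmcs12 */
    unsigned int full_runs; /* how often the slow path ran */
};

/*
 * Mirrors the switch added to handle_vmwrite(): only writes to fields
 * outside the fast list mark the cached state dirty.
 */
static void track_write(struct nested_state *s, enum field_id field)
{
    switch (field) {
    case FIELD_GUEST_RIP:
    case FIELD_GUEST_RSP:
        /* Always copied on entry, so no tracking is needed. */
        break;
    default:
        s->dirty = true;
        break;
    }
}

/* Mirrors set_current_vmptr(): a newly loaded vmcs12 invalidates everything. */
static void load_new_vmcs12(struct nested_state *s)
{
    s->dirty = true;
}

/* Mirrors the tail of prepare_vmcs02(): run the slow path only when dirty. */
static void prepare_entry(struct nested_state *s)
{
    assert(s != NULL);
    if (s->dirty) {
        s->full_runs++;     /* stands in for prepare_vmcs02_full() */
        s->dirty = false;
    }
}
```

In this model, a sequence of load_new_vmcs12(), prepare_entry(), a write to FIELD_GUEST_RIP and another prepare_entry() runs the slow path once; a later write to FIELD_EPT_POINTER makes the next entry run it again.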

RE: [PATCH] usb: dwc3: gadget:Core consumes a trb software to fill a trb, in ISO

2017-12-24 Thread Lipengcheng

Hi,

> -Original Message-
> From: Felipe Balbi [mailto:ba...@kernel.org]
> Sent: Friday, December 22, 2017 3:54 PM
> To: Lipengcheng
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; Lipengcheng
> Subject: Re: [PATCH] usb: dwc3: gadget:Core consumes a trb software to fill a trb, in ISO
> 
> 
> Hi,
> 
> Lipengcheng  writes:
> 
> > In isochronous transfers, the current flow is that all TRBs (HWO=1)
> > are handled first. Then the core generates a DWC3_DEPEVT_XFERNOTREADY
> > event and software begins refilling TRBs, which produces a 0-length
> > packet. With this patch, software fills one TRB each time the core
> > consumes one. Normally, there will then never be a
> > DWC3_DEPEVT_XFERNOTREADY event or a 0-length packet.
> >
> > Signed-off-by: l00229106 
> 
> who is 100229106??
Sorry, it is my employee number. I will replace it with Pengcheng Li.
> 
> > ---
> >  drivers/usb/dwc3/gadget.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 981fd98..1e6c42e 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -2420,7 +2420,7 @@ static void dwc3_endpoint_transfer_complete(struct dwc3 *dwc,
> > if (!dep->endpoint.desc)
> > return;
> >
> > -   if (!usb_endpoint_xfer_isoc(dep->endpoint.desc))
> > +   if (!usb_endpoint_xfer_isoc(dep->endpoint.desc) || (dep->flags &
> > + DWC3_EP_TRANSFER_STARTED))
> 
> this is wrong. isoc endpoints should NEVER be prestarted.
The main purpose is to let the core handle one TRB and have software refill
the next TRB in the DWC3_DEPEVT_XFERINPROGRESS interrupt. Maybe it can be
modified like this:
if (!usb_endpoint_xfer_isoc(dep->endpoint.desc))
__dwc3_gadget_kick_transfer(dep);
+   else
+   dwc3_prepare_trbs(dep);
+
> 
> --
> balbi
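
The refill idea can be illustrated with a toy ring model. This is only a sketch of the behaviour described in the patch description, not of the dwc3 driver or the controller: the ring size, the struct and every function name below are hypothetical. It compares refilling the whole ring only after it runs dry with handing one TRB back to the hardware after each completed one, and counts the intervals in which the hardware finds nothing to send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4

struct trb_ring {
    bool hwo[RING_SIZE];    /* true: owned by hardware, waiting to be sent */
    size_t enqueue;         /* next slot software fills */
    size_t dequeue;         /* next slot hardware consumes */
};

/* Software side: hand one free TRB to hardware. Returns false if full. */
static bool sw_fill_one(struct trb_ring *r)
{
    assert(r != NULL);
    if (r->hwo[r->enqueue])
        return false;
    r->hwo[r->enqueue] = true;
    r->enqueue = (r->enqueue + 1) % RING_SIZE;
    return true;
}

/* Hardware side: consume one TRB. Returns false if the ring ran dry. */
static bool hw_consume_one(struct trb_ring *r)
{
    if (!r->hwo[r->dequeue])
        return false;
    r->hwo[r->dequeue] = false;
    r->dequeue = (r->dequeue + 1) % RING_SIZE;
    return true;
}

/*
 * Simulate `intervals` service intervals and return how many of them
 * found no TRB ready. With refill_each set, software refills one TRB
 * after every completion; otherwise it refills only after running dry.
 */
static unsigned int count_underruns(unsigned int intervals, bool refill_each)
{
    struct trb_ring r = { { false }, 0, 0 };
    unsigned int i, underruns = 0;

    while (sw_fill_one(&r))
        ;   /* initial fill of the whole ring */
    for (i = 0; i < intervals; i++) {
        if (!hw_consume_one(&r)) {
            underruns++;
            while (sw_fill_one(&r))
                ;   /* refill only after running dry */
            continue;
        }
        if (refill_each)
            sw_fill_one(&r);
    }
    return underruns;
}
```

In this model, refilling after every completion never lets the ring run dry, while the refill-when-empty policy loses one interval per pass over the ring. Whether the real controller should be driven this way for isochronous endpoints is exactly the point under discussion above.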

[PATCH] x86/kconfig: remove residual cmdline param "no-hlt"

2017-12-24 Thread zhenwei.pi

The "no-hlt" cmdline param was removed in commit
27be457000211a6903968dfce06d5f73f051a217, so drop the leftover reference
to it from the APM help text and renumber the remaining steps.

Signed-off-by: zhenwei.pi 
---
 arch/x86/Kconfig | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4fc98c..fa9c33c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2426,19 +2426,18 @@ menuconfig APM
 
  1) make sure that you have enough swap space and that it is
  enabled.
- 2) pass the "no-hlt" option to the kernel
- 3) switch on floating point emulation in the kernel and pass
+ 2) switch on floating point emulation in the kernel and pass
  the "no387" option to the kernel
- 4) pass the "floppy=nodma" option to the kernel
- 5) pass the "mem=4M" option to the kernel (thereby disabling
+ 3) pass the "floppy=nodma" option to the kernel
+ 4) pass the "mem=4M" option to the kernel (thereby disabling
  all but the first 4 MB of RAM)
- 6) make sure that the CPU is not over clocked.
- 7) read the sig11 FAQ at 
- 8) disable the cache from your BIOS settings
- 9) install a fan for the video card or exchange video RAM
- 10) install a better fan for the CPU
- 11) exchange RAM chips
- 12) exchange the motherboard.
+ 5) make sure that the CPU is not over clocked.
+ 6) read the sig11 FAQ at 
+ 7) disable the cache from your BIOS settings
+ 8) install a fan for the video card or exchange video RAM
+ 9) install a better fan for the CPU
+ 10) exchange RAM chips
+ 11) exchange the motherboard.
 
  To compile this driver as a module, choose M here: the
  module will be called apm.
-- 
2.7.4

Re: [PATCH] perf tool: Return all events as auto-completions after comma

2017-12-24 Thread Jin, Yao




One other thing you may want to look at:

   $ $ perf record -e cycles/

Should present the modifiers, i.e. these:

/*
  * Update according to parse-events.l
  */
static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 [PARSE_EVENTS__TERM_TYPE_USER]               = "",
 [PARSE_EVENTS__TERM_TYPE_CONFIG]             = "config",
 [PARSE_EVENTS__TERM_TYPE_CONFIG1]            = "config1",
 [PARSE_EVENTS__TERM_TYPE_CONFIG2]            = "config2",
 [PARSE_EVENTS__TERM_TYPE_NAME]               = "name",
 [PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD]      = "period",
 [PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ]        = "freq",
 [PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE] = "branch_type",
 [PARSE_EVENTS__TERM_TYPE_TIME]               = "time",
 [PARSE_EVENTS__TERM_TYPE_CALLGRAPH]          = "call-graph",
 [PARSE_EVENTS__TERM_TYPE_STACKSIZE]          = "stack-size",
 [PARSE_EVENTS__TERM_TYPE_NOINHERIT]          = "no-inherit",
 [PARSE_EVENTS__TERM_TYPE_INHERIT]            = "inherit",
 [PARSE_EVENTS__TERM_TYPE_MAX_STACK]          = "max-stack",
 [PARSE_EVENTS__TERM_TYPE_OVERWRITE]          = "overwrite",
 [PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]        = "no-overwrite",
 [PARSE_EVENTS__TERM_TYPE_DRV_CFG]            = "driver-config",
};

:-)

- Arnaldo



Hi Arnaldo,

Currently, in my understanding, the modifiers appended to an event are:

u/k/h/I/G/H/p/...

For example,
perf stat -e cycles:u

Does perf currently support modifiers like "cycles/config",
"cycles/config1", ..., or "cycles/driver-config"?


I tried some command lines but they failed. Maybe the format I used
was not correct.


Or do you mean the format like "cpu/xxx"? For example,
perf stat -e cpu/event=0x0e,umask=0x1,inv/ -a sleep 1

Anyway, if we want to implement auto-completion for the modifiers,
it would be better to expose them through an interface (e.g. perf list --xx)
rather than hardcode them in the auto-completion script. That's my initial idea.
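
As a rough illustration of that idea, the sketch below prints the term names from a single table, so a completion script could read them from the tool instead of carrying its own copy. It is only a sketch: the table is a shortened stand-in for config_term_names, and list_term_names() is a hypothetical helper, not an existing perf function or option.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Shortened stand-in for the table quoted earlier in this mail. */
static const char *const term_names[] = {
    "", "config", "config1", "config2", "name", "period", "freq",
};
#define N_TERMS (sizeof(term_names) / sizeof(term_names[0]))

/*
 * Write the non-empty term names into buf, one per line, the way a
 * hypothetical listing option could print them for a completion script.
 * Returns the number of names written, or -1 if buf is too small.
 */
static int list_term_names(char *buf, size_t len)
{
    size_t used = 0, i;
    int count = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < N_TERMS; i++) {
        int n;

        if (term_names[i][0] == '\0')
            continue;   /* skip the slot that has no name */
        n = snprintf(buf + used, len - used, "%s\n", term_names[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
        count++;
    }
    return count;
}
```

The completion script would then only need to split that output on newlines, and any term added to the table later would show up without touching the script.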


Thanks
Jin Yao

Re: [f2fs-dev] [PATCH v2] f2fs: add an ioctl to disable GC for specific file

2017-12-24 Thread Chao Yu

On 2017/12/20 6:52, Jaegeuk Kim wrote:
> On 12/18, Chao Yu wrote:
>> On 2017/12/15 3:50, Jaegeuk Kim wrote:
>>> On 12/12, Chao Yu wrote:
>>>> Hi Jaegeuk,
>>>>
>>>> On 2017/12/9 3:37, Jaegeuk Kim wrote:
> Change log from v1:
>  - fix bug in error handling of ioctl 
>
> >From b905e03d8aad7d25ecaf9bde05411a68d3d2460e Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim 
> Date: Thu, 7 Dec 2017 16:25:39 -0800
> Subject: [PATCH] f2fs: add an ioctl to disable GC for specific file
>
> This patch gives a flag to disable GC on given file, which would be useful
> when user wants to keep its block map.

>>>> Could you add some description about in which scenario a userspace
>>>> application can use this ioctl? That way developers outside Android can
>>>> get a hint about the usage of this interface, and maybe later it can be
>>>> used more widely. ;)
>>>
>>> The usecase would be somewhat hacky tho, it looks like 1) building a block
>>> map through fibmap, 2) storing the map in another partition, 3) using the
>>> map to overwrite data contents directly by low-level tools. In that case,
>>> we have to keep its block locations.
>>
>> That's so hacky, it will be better to add simple description in comment log. ;)
> 
> Actually, I don't want to add this kind of hacky example. :P

Not a big deal; anyway, I understand your purpose now. ;)

> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/f2fs.h  | 17 
>  fs/f2fs/file.c  | 54 
> +
>  fs/f2fs/gc.c| 11 ++
>  fs/f2fs/gc.h|  2 ++
>  fs/f2fs/sysfs.c |  2 ++
>  include/linux/f2fs_fs.h |  1 +
>  6 files changed, 87 insertions(+)
>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 82f1dc345505..dd76cbf02791 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -375,6 +375,8 @@ static inline bool __has_cursum_space(struct f2fs_journal *journal,
>  #define F2FS_IOC_FSGETXATTR  FS_IOC_FSGETXATTR
>  #define F2FS_IOC_FSSETXATTR  FS_IOC_FSSETXATTR
>  
> +#define F2FS_IOC_SET_DONTMOVE  _IO(F2FS_IOCTL_MAGIC, 13)
> +
>  struct f2fs_gc_range {
>   u32 sync;
>   u64 start;
> @@ -1129,6 +1131,9 @@ struct f2fs_sb_info {
>   /* threshold for converting bg victims for fg */
>   u64 fggc_threshold;
>  
> + /* threshold for dontmove gc trials */
> + u64 gc_dontmove;
> +
>   /* maximum # of trials to find a victim segment for SSR and GC */
>   unsigned int max_victim_search;
>  
> @@ -2104,6 +2109,7 @@ enum {
>   FI_HOT_DATA,/* indicate file is hot */
>   FI_EXTRA_ATTR,  /* indicate file has extra attribute */
>   FI_PROJ_INHERIT,/* indicate file inherits projectid */
> + FI_DONTMOVE,/* indicate file should not be gced */
>  };
>  
>  static inline void __mark_inode_dirty_flag(struct inode *inode,
> @@ -2117,6 +2123,7 @@ static inline void __mark_inode_dirty_flag(struct inode *inode,
>   return;
>   case FI_DATA_EXIST:
>   case FI_INLINE_DOTS:
> + case FI_DONTMOVE:
>   f2fs_mark_inode_dirty_sync(inode, true);
>   }
>  }
> @@ -2225,6 +2232,8 @@ static inline void get_inline_info(struct inode *inode, struct f2fs_inode *ri)
>   set_bit(FI_INLINE_DOTS, &fi->flags);
>   if (ri->i_inline & F2FS_EXTRA_ATTR)
>   set_bit(FI_EXTRA_ATTR, &fi->flags);
> + if (ri->i_inline & F2FS_DONTMOVE)
> + set_bit(FI_DONTMOVE, &fi->flags);
>  }
>  
>  static inline void set_raw_inline(struct inode *inode, struct f2fs_inode *ri)
> @@ -2243,6 +2252,8 @@ static inline void set_raw_inline(struct inode *inode, struct f2fs_inode *ri)
>   ri->i_inline |= F2FS_INLINE_DOTS;
>   if (is_inode_flag_set(inode, FI_EXTRA_ATTR))
>   ri->i_inline |= F2FS_EXTRA_ATTR;
> + if (is_inode_flag_set(inode, FI_DONTMOVE))
> + ri->i_inline |= F2FS_DONTMOVE;
>  }
>  
>  static inline int f2fs_has_extra_attr(struct inode *inode)
> @@ -2288,6 +2299,11 @@ static inline int f2fs_has_inline_dots(struct inode *inode)
>   return is_inode_flag_set(inode, FI_INLINE_DOTS);
>  }
>  
> +static inline bool f2fs_is_dontmove_file(struct inode *inode)
> +{
> + return is_inode_flag_set(inode, FI_DONTMOVE);
> +}
> +
>  static inline bool f2fs_is_atomic_file(struct inode *inode)
>  {
>   return is_inode_flag_set(inode, FI_ATOMIC_FILE);
> @@ -2503,6 +2519,7 @@ int truncate_hole(struct inode *inode, pgoff_t pg_start, pgoff_t pg_end);
>  void truncate_data_blocks_range(struct dnode_of_data *dn, int count);
>  long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
>
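
The get_inline_info() and set_raw_inline() hunks in the quoted patch form a pair: one loads the on-disk bit into the in-memory inode flags and the other stores it back, which is what lets the flag survive the inode being dropped from memory. Below is a minimal userspace sketch of that round trip. The bit values, the table and the helper names are hypothetical and chosen only for illustration; they are not taken from the f2fs headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory flag numbers (bit positions); hypothetical values. */
enum { FI_INLINE_DOTS, FI_EXTRA_ATTR, FI_DONTMOVE, FI_NR_FLAGS };

/* On-disk i_inline bits; hypothetical values. */
#define DISK_INLINE_DOTS 0x10u
#define DISK_EXTRA_ATTR  0x20u
#define DISK_DONTMOVE    0x40u

static const struct { int mem_bit; uint8_t disk_mask; } flag_map[] = {
    { FI_INLINE_DOTS, DISK_INLINE_DOTS },
    { FI_EXTRA_ATTR,  DISK_EXTRA_ATTR  },
    { FI_DONTMOVE,    DISK_DONTMOVE    },
};
#define N_FLAGS (sizeof(flag_map) / sizeof(flag_map[0]))

/* Load direction, like get_inline_info(): disk bits -> memory flags. */
static unsigned long load_flags(uint8_t i_inline)
{
    unsigned long flags = 0;
    size_t i;

    for (i = 0; i < N_FLAGS; i++)
        if (i_inline & flag_map[i].disk_mask)
            flags |= 1UL << flag_map[i].mem_bit;
    return flags;
}

/* Store direction, like set_raw_inline(): memory flags -> disk bits. */
static uint8_t store_flags(unsigned long flags)
{
    uint8_t i_inline = 0;
    size_t i;

    for (i = 0; i < N_FLAGS; i++)
        if (flags & (1UL << flag_map[i].mem_bit))
            i_inline |= flag_map[i].disk_mask;
    return i_inline;
}

/* Returns 1 if the given flag survives a store followed by a load. */
static int survives_round_trip(int mem_bit)
{
    unsigned long flags;

    assert(mem_bit >= 0 && mem_bit < FI_NR_FLAGS);
    flags = 1UL << mem_bit;
    return load_flags(store_flags(flags)) == flags;
}
```

If an entry were missing from either direction of the mapping, survives_round_trip() would return 0 for that flag, which corresponds to a flag that is set through the ioctl but forgotten after the inode is reloaded.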

Re: TPM driver breaks S3 suspend

2017-12-24 Thread Chris Chiu

On Mon, Dec 25, 2017 at 4:33 AM, Jarkko Sakkinen
 wrote:
> On Thu, Dec 21, 2017 at 04:04:56PM +0800, Chris Chiu wrote:
>> Hi,
>> We have a desktop which has S3 suspend (to RAM) problem due to
>> error messages as follows.
>> [  198.908282] tpm tpm0: Error (38) sending savestate before suspend
>> [  198.908289] __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x160 returns 38
>> [  198.908293] dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns 38
>> [  198.908298] PM: Device 00:0b failed to suspend: error 38
>>
>> However, the first suspend after boot is working although it still
>> shows an interesting message during resume.
>> [  155.789945] tpm tpm0: A TPM error (38) occurred continue selftest
>>
>> The error code 38 in definition is TPM_ERR_INVALID_POSTINIT. I
>> found some explanations which said this error code means that this
>> command was received in the wrong sequence relative to a TPM_Startup
>> command. Don't really know what happens here and how should I deal
>> with this? Any suggestions? Please let me know what else information
>> should I provide. Thanks
>>
>> Chris
>
> The sequence for initializing TPM 1.x devices has been fairly static
> for a long time. Has this occurred after a kernel update? Is there a
> kernel version where it used to work and a version where it doesn't?
> Thanks.
>
> /Jarkko
Hi Jarkko,

Actually, it's a new Acer machine on which I have never tried an older
kernel. I have only tried versions >= 4.13. "lsmod | grep tpm" prints
nothing, so I think the driver is not built as a module.

Chris
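
For reference, a minimal sketch of how the numeric code in the log above maps to a symbolic name. The three values are assumed to match the TPM 1.x error codes in the kernel's drivers/char/tpm/tpm.h; the helper tpm1_err_name() is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <string.h>

/* TPM 1.x return codes; values assumed to match drivers/char/tpm/tpm.h. */
enum tpm1_err {
	TPM_ERR_DEACTIVATED      = 0x6,
	TPM_ERR_DISABLED         = 0x7,
	TPM_ERR_INVALID_POSTINIT = 38,
};

/* Hypothetical helper: translate a TPM 1.x return code into a name. */
static const char *tpm1_err_name(unsigned int rc)
{
	switch (rc) {
	case TPM_ERR_DEACTIVATED:
		return "TPM_ERR_DEACTIVATED";
	case TPM_ERR_DISABLED:
		return "TPM_ERR_DISABLED";
	case TPM_ERR_INVALID_POSTINIT:
		return "TPM_ERR_INVALID_POSTINIT";
	default:
		return "unknown";
	}
}
```

Passing the 38 reported by tpm_pm_suspend() to this helper yields the name quoted in the report, i.e. a command that arrived in the wrong order relative to TPM_Startup.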

[PATCH v2 1/3] tpm: delete the TPM_TIS_CLK_ENABLE flag

2017-12-24 Thread Javier Martinez Canillas

This flag is only used to warn if CLKRUN_EN wasn't disabled on Braswell
systems, but the only way this can happen is if the code is not correct.

So it's an unnecessary check that just makes the code harder to read.

Suggested-by: Jarkko Sakkinen 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
---

 drivers/char/tpm/tpm_tis.c  | 15 ---
 drivers/char/tpm/tpm_tis_core.c |  2 --
 drivers/char/tpm/tpm_tis_core.h |  1 -
 3 files changed, 18 deletions(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index c847fc69a2fc..4b73e28458e3 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -138,9 +138,6 @@ static int tpm_tcg_read_bytes(struct tpm_tis_data *data, 
u32 addr, u16 len,
 {
struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
-   if (is_bsw() && !(data->flags & TPM_TIS_CLK_ENABLE))
-   WARN(1, "CLKRUN not enabled!\n");
-
while (len--)
*result++ = ioread8(phy->iobase + addr);
 
@@ -152,9 +149,6 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, 
u32 addr, u16 len,
 {
struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
-   if (is_bsw() && !(data->flags & TPM_TIS_CLK_ENABLE))
-   WARN(1, "CLKRUN not enabled!\n");
-
while (len--)
iowrite8(*value++, phy->iobase + addr);
 
@@ -165,9 +159,6 @@ static int tpm_tcg_read16(struct tpm_tis_data *data, u32 
addr, u16 *result)
 {
struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
-   if (is_bsw() && !(data->flags & TPM_TIS_CLK_ENABLE))
-   WARN(1, "CLKRUN not enabled!\n");
-
*result = ioread16(phy->iobase + addr);
 
return 0;
@@ -177,9 +168,6 @@ static int tpm_tcg_read32(struct tpm_tis_data *data, u32 
addr, u32 *result)
 {
struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
-   if (is_bsw() && !(data->flags & TPM_TIS_CLK_ENABLE))
-   WARN(1, "CLKRUN not enabled!\n");
-
*result = ioread32(phy->iobase + addr);
 
return 0;
@@ -189,9 +177,6 @@ static int tpm_tcg_write32(struct tpm_tis_data *data, u32 
addr, u32 value)
 {
struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
-   if (is_bsw() && !(data->flags & TPM_TIS_CLK_ENABLE))
-   WARN(1, "CLKRUN not enabled!\n");
-
iowrite32(value, phy->iobase + addr);
 
return 0;
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index dc3600fc79b7..f2bd99fa8352 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -751,7 +751,6 @@ static void tpm_tis_clkrun_enable(struct tpm_chip *chip, 
bool value)
return;
 
if (value) {
-   data->flags |= TPM_TIS_CLK_ENABLE;
data->clkrun_enabled++;
if (data->clkrun_enabled > 1)
return;
@@ -782,7 +781,6 @@ static void tpm_tis_clkrun_enable(struct tpm_chip *chip, 
bool value)
 * sure LPC clock is running before sending any TPM command.
 */
outb(0xCC, 0x80);
-   data->flags &= ~TPM_TIS_CLK_ENABLE;
}
 }
 
diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
index afc50cde1ba6..d5c6a2e952b3 100644
--- a/drivers/char/tpm/tpm_tis_core.h
+++ b/drivers/char/tpm/tpm_tis_core.h
@@ -86,7 +86,6 @@ enum tis_defaults {
 
 enum tpm_tis_flags {
TPM_TIS_ITPM_WORKAROUND = BIT(0),
-   TPM_TIS_CLK_ENABLE  = BIT(1),
 };
 
 struct tpm_tis_data {
-- 
2.14.3

[PATCH v2 0/3] tpm: fix PS/2 devices not working on Braswell systems due CLKRUN enabled

2017-12-24 Thread Javier Martinez Canillas

Hello,

Commit 5e572cab92f0 ("tpm: Enable CLKRUN protocol for Braswell systems")
added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
signal during TPM transactions.

Unfortunately this breaks other devices that are attached to the LPC bus,
such as PS/2 mice and keyboards.

The bug was reported to the Fedora kernel [0] and the kernel bugzilla [1].
The issue and the proposed solution were discussed in this [2] thread,
and the reporter (James Ettle) confirmed that his system works again after
the fix in this series.

The patches are based on top of Jarkko Sakkinen's linux-tpmdd [3] tree.

Changes since v1:
- Add collected tags
- Drop patch that fixed a bug in the error path since it was already fixed.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1498987
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=197287
[2]: https://patchwork.kernel.org/patch/10119417/
[3]: git.infradead.org/users/jjs/linux-tpmdd.git

Best regards,
Javier


Javier Martinez Canillas (3):
  tpm: delete the TPM_TIS_CLK_ENABLE flag
  tpm: follow coding style for variable declaration in
tpm_tis_core_init()
  tpm: only attempt to disable the LPC CLKRUN if is already enabled

 drivers/char/tpm/tpm_tis.c  | 15 ---
 drivers/char/tpm/tpm_tis_core.c | 17 +
 drivers/char/tpm/tpm_tis_core.h |  1 -
 3 files changed, 13 insertions(+), 20 deletions(-)

-- 
2.14.3

[PATCH v2 2/3] tpm: follow coding style for variable declaration in tpm_tis_core_init()

2017-12-24 Thread Javier Martinez Canillas

The coding style says "use just one data declaration per line (no commas
for multiple data declarations)" so follow this convention.

Suggested-by: Jarkko Sakkinen 
Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 
---

 drivers/char/tpm/tpm_tis_core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index f2bd99fa8352..03daf7017e0f 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -803,7 +803,9 @@ int tpm_tis_core_init(struct device *dev, struct 
tpm_tis_data *priv, int irq,
  const struct tpm_tis_phy_ops *phy_ops,
  acpi_handle acpi_dev_handle)
 {
-   u32 vendor, intfcaps, intmask;
+   u32 vendor;
+   u32 intfcaps;
+   u32 intmask;
u8 rid;
int rc, probe;
struct tpm_chip *chip;
-- 
2.14.3

[PATCH v2 3/3] tpm: only attempt to disable the LPC CLKRUN if is already enabled

2017-12-24 Thread Javier Martinez Canillas

Commit 5e572cab92f0 ("tpm: Enable CLKRUN protocol for Braswell systems")
added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
signal during TPM transactions.

Unfortunately this breaks other devices that are attached to the LPC bus,
such as PS/2 mice and keyboards.

One flaw in the logic is that it assumes that CLKRUN is always
enabled, and so it unconditionally enables it after a TPM transaction.

But it could be that the CLKRUN# signal was already disabled on the LPC
bus, in which case CLKRUN_EN will remain enabled after the driver probes.
This may break other devices that are attached to the LPC bus but don't
support the CLKRUN protocol.

Fixes: 5e572cab92f0 ("tpm: Enable CLKRUN protocol for Braswell systems")
Signed-off-by: Javier Martinez Canillas 
Tested-by: James Ettle 
Tested-by: Jeffery Miller 
Reviewed-by: Jarkko Sakkinen 
Tested-by: Jarkko Sakkinen 

---

This patch fixes the bug reported for the Fedora kernel [0] and the kernel
bugzilla [1]. The issue and the proposed solution were discussed in this
[2] thread.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1498987
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=197287
[2]: https://patchwork.kernel.org/patch/10119417/


 drivers/char/tpm/tpm_tis_core.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 03daf7017e0f..a72a9f03286d 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -747,7 +747,8 @@ static void tpm_tis_clkrun_enable(struct tpm_chip *chip, 
bool value)
struct tpm_tis_data *data = dev_get_drvdata(&chip->dev);
u32 clkrun_val;
 
-   if (!IS_ENABLED(CONFIG_X86) || !is_bsw())
+   if (!IS_ENABLED(CONFIG_X86) || !is_bsw() ||
+   !data->ilb_base_addr)
return;
 
if (value) {
@@ -806,6 +807,7 @@ int tpm_tis_core_init(struct device *dev, struct 
tpm_tis_data *priv, int irq,
u32 vendor;
u32 intfcaps;
u32 intmask;
+   u32 clkrun_val;
u8 rid;
int rc, probe;
struct tpm_chip *chip;
@@ -831,6 +833,13 @@ int tpm_tis_core_init(struct device *dev, struct 
tpm_tis_data *priv, int irq,
ILB_REMAP_SIZE);
if (!priv->ilb_base_addr)
return -ENOMEM;
+
+   clkrun_val = ioread32(priv->ilb_base_addr + LPC_CNTRL_OFFSET);
+   /* Check if CLKRUN# is already not enabled in the LPC bus */
+   if (!(clkrun_val & LPC_CLKRUN_EN)) {
+   iounmap(priv->ilb_base_addr);
+   priv->ilb_base_addr = NULL;
+   }
}
 
if (chip->ops->clk_enable != NULL)
-- 
2.14.3
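
The behaviour this patch aims for can be sketched in plain userspace C. This is only a model under stated assumptions: the LPC control register is an ordinary variable, the LPC_CLKRUN_EN bit position is chosen for illustration, and struct fake_tis, fake_probe() and fake_clkrun_enable() are hypothetical names rather than driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LPC_CLKRUN_EN (1u << 2)	/* bit position assumed for illustration */

/* Hypothetical stand-in for the driver state touched by the patch. */
struct fake_tis {
	uint32_t lpc_cntrl;		/* models the LPC control register */
	bool manage_clkrun;		/* plays the role of a non-NULL ilb_base_addr */
	unsigned int clkrun_enabled;	/* nesting counter, as in the driver */
};

/* At probe time, remember whether CLKRUN was enabled in the first place. */
static void fake_probe(struct fake_tis *t)
{
	t->manage_clkrun = (t->lpc_cntrl & LPC_CLKRUN_EN) != 0;
	t->clkrun_enabled = 0;
}

/* value == true: keep the LPC clock running; false: restore CLKRUN. */
static void fake_clkrun_enable(struct fake_tis *t, bool value)
{
	if (!t->manage_clkrun)
		return;	/* CLKRUN was never enabled: leave the register alone */

	if (value) {
		if (++t->clkrun_enabled > 1)
			return;
		t->lpc_cntrl &= ~LPC_CLKRUN_EN;
	} else {
		/* Calls are expected to be balanced, as in the driver. */
		if (--t->clkrun_enabled)
			return;
		t->lpc_cntrl |= LPC_CLKRUN_EN;
	}
}
```

With the probe-time check in place, a system that boots with CLKRUN_EN clear keeps it clear across TPM transactions, while a system that boots with it set still gets the temporary disable/re-enable sequence.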

Re: [PATCH v2] arm64: dts: Hi3660: Fix up psci state id

2017-12-24 Thread Leo Yan

Hi Vincent,

[ + John, Kevin Wang ]

On Fri, Dec 22, 2017 at 03:22:51PM +0100, Vincent Guittot wrote:
> Hi Leo,
> 
> Sorry for jumping late into the discussion, but should we also remove
> the NAP state from the cpu-idle-states property of the CPUs, because
> this state is not supported by the platform, at least for now and maybe
> not in the near future?

Thanks for bringing this up.

I don't want to hide anything from the patch discussion :) This patch
resolves the PSCI parameter mismatch between the kernel and ARM-TF; it
is not meant to fix the CPU_NAP bug, so I didn't mention the CPU_NAP
malfunction, to keep the discussion simple.

I want to keep the CPU_NAP state and track the CPU_NAP bug until it is
fixed; if we remove this state, I suspect we might never get a chance to
enable it again. In the end this is up to our Hisilicon colleagues and
whether they have time to fix it.

I will check offline with Daniel and Kevin about this; if we finally
decide to remove it, we can submit an extra patch for that later.
What do you think?

> Then, I have another question regarding the update of the
> psci-suspend-parameter. These changes imply an update of the PSCI
> firmware, which means that we will now have 2 different firmware
> versions compatible with 2 different DTs.
> 
> Is there any way to check that the ATF on the board is the one that
> is compatible with the parameter, with something like a version? I
> currently use the previous firmware, which works fine with the current
> kernel and DT binding once the NAP state is removed from the table.
> When moving to a recent kernel, I will have to take care of updating the
> firmware, and if I need to go back to a previous kernel, I will have to
> make sure that I have the right ATF version. This leaves a lot of room
> for ending up with the wrong configuration.

AFAIK, we cannot distinguish the PSCI parameter format by PSCI version or
ARM-TF version number. One simple alternative is to read the commit ID
(e.g. 83df7ce) from the ARM-TF log: any ARM-TF build whose commit is
newer than the patch fdae60b6ba27 ("Hikey960: Change to use recommended
power state id format") should be used with this kernel patch.

NOTICE:  BL1: Booting BL31
NOTICE:  BL31: v1.4(debug):v1.4-441-g83df7ce-dirty
NOTICE:  BL31: Built : 17:31:35, Dec 22 2017
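
Pulling the commit ID out of such a banner can be sketched as follows. This assumes the version string keeps the git-describe style shown above, where the abbreviated commit follows "-g"; atf_commit_id() is a hypothetical helper, not part of ARM-TF or the kernel.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the abbreviated commit ID that follows "-g"
 * in a git-describe style version string into out (at most n - 1 chars).
 * Returns the number of characters copied, or 0 if no ID was found.
 */
static size_t atf_commit_id(const char *banner, char *out, size_t n)
{
	const char *p = banner;
	size_t len = 0;

	if (n == 0)
		return 0;
	out[0] = '\0';

	/* Find a "-g" that is directly followed by a hex digit. */
	while ((p = strstr(p, "-g")) != NULL) {
		p += 2;
		if (isxdigit((unsigned char)*p))
			break;
	}
	if (p == NULL)
		return 0;

	/* Copy the run of hex digits that forms the commit ID. */
	while (len + 1 < n && isxdigit((unsigned char)p[len])) {
		out[len] = p[len];
		len++;
	}
	out[len] = '\0';
	return len;
}
```

The extracted ID only identifies the build. Whether that commit is newer than fdae60b6ba27 still has to be checked against the ARM-TF git history, for example with git merge-base --is-ancestor.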

BTW, I hope we can upgrade both the Linux kernel and ARM-TF to the latest
code base to avoid compatibility issues. The official Android release uses
the old PSCI parameters with Hisilicon legacy boot images, so it works
well, but if someone uses ARM-TF mainline code + Android kernel
4.4/4.9, there will be a compatibility issue.

I am monitoring the integration of ARM-TF/UEFI into Android on Hikey960;
we need to backport this patch onto Android kernel 4.4/4.9 ASAP after
ARM-TF/UEFI is integrated.

Thanks,
Leo Yan

> Regards,
> Vincent
> 
> On 12 December 2017 at 10:12, Leo Yan  wrote:
> > Thanks a lot to Vincent Guittot for his careful work finding the bug
> > in the 'CPU_NAP' idle state.  From the ftrace log we can observe that
> > CA73 CPUs are easily woken up from the 'CPU_NAP' state, but the woken
> > CPUs don't handle anything and go to sleep again, so there are tons of
> > trace events for CA73 CPUs entering and exiting the idle state.
> >
> > On Hi3660, CA73 has the retention state 'CPU_NAP' for CPU idle. We
> > set its psci parameter to '0x001', from which the state id is
> > calculated as 1.  Unfortunately ARM trusted firmware (ARM-TF)
> > treats 1 as an invalid value for the state id, so the CPU cannot enter
> > the idle state and bails out directly to the kernel.
> >
> > We want to establish good practice for platform psci parameter
> > definitions, so we reviewed the psci specification. The spec "ARM Power
> > State Coordination Interface - Platform Design Document (ARM DEN 0022D)"
> > recommends a state ID encoding in chapter "6.5 Recommended StateID
> > Encoding".  The state ID is divided into three fields of 4 bits each,
> > corresponding to the core level, cluster level and system level; each
> > field takes one of the values listed below:
> >   0: Run
> >   1: Standby
> >   2: Retention
> >   3: Powerdown
> >
> > This commit changes the psci parameters to comply with the suggested
> > state ID in the doc.  Besides changing the 'CPU_NAP' state psci parameter
> > to '0x002', this commit also changes the 'CPU_SLEEP' and 'CLUSTER_SLEEP'
> > state parameters to '0x0010003' and '0x1010033' respectively.
> >
> > Credits to Daniel, Sudeep and Soby for suggestions and consolidation.
> >
> > Cc: Vincent Guittot 
> > Cc: Daniel Lezcano 
> > Cc: Sudeep Holla 
> > Cc: Soby Mathew 
> > Signed-off-by: Leo Yan 
> > ---
> >  arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi 
> > b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > index ab0b95b..99d5a46 100644
> > --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
> > +++
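
To make the encoding discussed above concrete, here is a small decoder sketch. It assumes the original (non-extended) PSCI power_state layout, with StateID in bits [15:0], StateType in bit 16 and PowerLevel in bits [25:24], plus the recommended 4-bit StateID fields quoted above; struct psci_state and psci_decode() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-level values from the recommended encoding quoted above. */
enum psci_level_state {
	PSCI_RUN       = 0,
	PSCI_STANDBY   = 1,
	PSCI_RETENTION = 2,
	PSCI_POWERDOWN = 3,
};

/* Hypothetical decoded view of a psci-suspend-param value. */
struct psci_state {
	unsigned int core;	/* StateID bits [3:0] */
	unsigned int cluster;	/* StateID bits [7:4] */
	unsigned int system;	/* StateID bits [11:8] */
	bool powerdown;		/* StateType, bit 16 */
	unsigned int level;	/* PowerLevel, bits [25:24] */
};

static struct psci_state psci_decode(uint32_t param)
{
	struct psci_state s = {
		.core      = param & 0xf,
		.cluster   = (param >> 4) & 0xf,
		.system    = (param >> 8) & 0xf,
		.powerdown = ((param >> 16) & 0x1) != 0,
		.level     = (param >> 24) & 0x3,
	};

	return s;
}
```

Under these assumptions the new CLUSTER_SLEEP value 0x1010033 decodes to core and cluster both in Powerdown at power level 1, and CPU_SLEEP 0x0010003 to a core-only Powerdown at level 0.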

[RFC] dmaengine: pl330: fix a race condition in case of threaded irqs

2017-12-24 Thread Qi Hou

I found the problem described below. I now understand why it happens, but I'm
not 100% sure what the best way to fix it is.

When booting with "threadirqs" on the command line, all irq handlers of the DMA
controller pl330 are forcibly threaded. These threads race for the same
list, pl330->req_done.

Before the callback was invoked, the spinlock was released, and after it
returned, the spinlock was taken again. This opened a race window in which
another threaded irq handler could take the spinlock and delete entries from
the list pl330->req_done.

If the latter deleted an entry that was still referred to by the former, there
would be a kernel panic when the former was scheduled again and tried to get
the next sibling of the deleted entry.

The scenario could be depicted as below:

  Thread: T1  pl330->req_done  Thread: T2
  | |  |
  |  -A-B-C-D- |
Locked  |  |
  | |   Waiting
Del A   |  |
  |  -B-C-D-   |
Unlocked|  |
  | |   Locked
Waiting |  |
  | |Del B
  | |  |
  |   -C-D- Unlocked
Waiting |  |
  |
Locked
  |
   get C via B
  \
   - Kernel panic

The kernel panic looked like as below:

Unable to handle kernel paging request at virtual address dead0108
pgd = ff8008c9e000
[dead0108] *pgd=00027fffe003, *pud=00027fffe003, 
*pmd=
Internal error: Oops: 9644 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 85 Comm: irq/59-6633 Not tainted 4.8.24-WR9.0.0.12_standard #2
Hardware name: Broadcom NS2 SVK (DT)
task: ffc1f5cc3c00 task.stack: ffc1f5ce
PC is at pl330_irq_handler+0x27c/0x390
LR is at pl330_irq_handler+0x2a8/0x390
pc : [] lr : [] pstate: 81c5
sp : ffc1f5ce3d00
x29: ffc1f5ce3d00 x28: 0140
x27: ffc1f5c530b0 x26: dead0100
x25: dead0200 x24: 00418958
x23: 0001 x22: ffc1f5ccd668
x21: ffc1f5ccd590 x20: ffc1f5ccd418
x19: dead0060 x18: 0001
x17: 0007 x16: 0001
x15:  x14: 
x13:  x12: 
x11: 0001 x10: 0840
x9 : ffc1f5ce x8 : ffc1f5cc3338
x7 : ff8008ce2020 x6 : 
x5 :  x4 : 0001
x3 : dead0200 x2 : dead0100
x1 : 0140 x0 : ffc1f5ccd590

Process irq/59-6633 (pid: 85, stack limit = 0xffc1f5ce0020)
Stack: (0xffc1f5ce3d00 to 0xffc1f5ce4000)
3d00: ffc1f5ce3d80 ff80080f09d0 ffc1f5ca0c00 ffc1f6f7c600
3d20: ffc1f5ce ffc1f6f7c600 ffc1f5ca0c00 ff80080f0998
3d40: ffc1f5ce ff80080f  
3d60: ff8008ce202c ff8008ce2020 ffc1f5ccd668 ffc1f5c530b0
3d80: ffc1f5ce3db0 ff80080f0d70 ffc1f5ca0c40 0001
3da0: ffc1f5ce ff80080f0cfc ffc1f5ce3e20 ff80080bf4f8
3dc0: ffc1f5ca0c80 ff8008bf3798 ff8008955528 ffc1f5ca0c00
3de0: ff80080f0c30   
3e00:    ff80080f0b68
3e20:  ff8008083690 ff80080bf420 ffc1f5ca0c80
3e40:    ff80080cb648
3e60: ff8008b1c780   ffc1f5ca0c00
3e80: ffc1 ff80 ffc1f5ce3e90 ffc1f5ce3e90
3ea0:  ff80 ffc1f5ce3eb0 ffc1f5ce3eb0
3ec0:    
3ee0:    
3f00:    
3f20:    
3f40:    
3f60:    
3f80:    
3fa0:    
3fc0:  0005  
3fe0:   000275ce3ff0 000275ce3ff8
Call trace:
Exception stack(0xffc1f5ce3b30 to 0xffc1f5ce3c60)
3b20:   dead0060 0080
3b40: ffc1f5ce3d00 ff80084cb694 0008 0e88
3b60: ffc1f5ce3bb0 ff80080dac68 ffc1f5ce3b90 ff8008826fe4
3b80: 01c0 01c0 ffc1f5ce3bb0 ff800848dfcc
3ba0: 0002 ff8008b15ae4 ffc1f5ce3c00 ff800808f000
3bc0: 0010

Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT

2017-12-24 Thread Liubo(OS Lab)

On 2017/12/23 6:31, Ross Zwisler wrote:
> On Fri, Dec 22, 2017 at 08:39:41AM +0530, Anshuman Khandual wrote:
>> On 12/14/2017 07:40 AM, Ross Zwisler wrote:
> <>
>>> We solve this issue by providing userspace with performance information on
>>> individual memory ranges.  This performance information is exposed via
>>> sysfs:
>>>
>>>   # grep . mem_tgt2/* mem_tgt2/local_init/* 2>/dev/null
>>>   mem_tgt2/firmware_id:1
>>>   mem_tgt2/is_cached:0
>>>   mem_tgt2/local_init/read_bw_MBps:40960
>>>   mem_tgt2/local_init/read_lat_nsec:50
>>>   mem_tgt2/local_init/write_bw_MBps:40960
>>>   mem_tgt2/local_init/write_lat_nsec:50
> <>
>> We will enlist properties for all possible "source --> target" on the system?
> 
> Nope, just 'local' initiator/target pairs.  I talk about the reasoning for
> this in the cover letter for patch 3:
> 
> https://lists.01.org/pipermail/linux-nvdimm/2017-December/013574.html
> 
>> Right now it shows only bandwidth and latency properties, can it accommodate
>> other properties as well in future ?
> 
> We also have an 'is_cached' attribute for the memory targets if they are
> involved in a caching hierarchy, but right now those are all the things we
> expose.  We can potentially expose whatever we want that is present in the
> HMAT, but those seemed like a good start.
> 
> I noticed that in your presentation you had some other examples of attributes
> you cared about:
> 
>  * reliability
>  * power consumption
>  * density
> 
> The HMAT doesn't provide this sort of information at present, but we
> could/would add them to sysfs if the HMAT ever grew support for them.
> 
>>> This allows applications to easily find the memory that they want to use.
>>> We expect that the existing NUMA APIs will be enhanced to use this new
>>> information so that applications can continue to use them to select their
>>> desired memory.
>>
>> I had presented a proposal for NUMA redesign in the Plumbers Conference this
>> year where various memory devices with different kind of memory attributes
>> can be represented in the kernel and be used explicitly from the user space.
>> Here is the link to the proposal if you feel interested. The proposal is
>> very intrusive and also I don't have an RFC for it yet for discussion here.
>>
>> https://linuxplumbersconf.org/2017/ocw//system/presentations/4656/original/Hierarchical_NUMA_Design_Plumbers_2017.pdf
>>
>> Problem is, designing the sysfs interface for memory attribute detection
>> from user space without first thinking about redesigning the NUMA for
>> heterogeneous memory may not be a good idea. Will look into this further.
> 
> I took another look at your presentation, and overall I think that if/when a
> NUMA redesign like this takes place ACPI systems with HMAT tables will be able
> to participate.  But I think we are probably a ways away from that, and like I

I'm afraid not: cache-coherent buses like CCIX/OpenCAPI are coming out soon.
Not to mention that Systems-on-Chip already link DDR, HBM, CPUs and
accelerators over an internal bus.

> said in my previous mail ACPI systems with memory-only NUMA nodes are going to
> exist and need to be supported with the current NUMA scheme.  Hence I don't

And it is not only memory-only nodes: accelerators can also be a master, like a CPU.

> think that this patch series conflicts with your proposal.

I don't see a conflict either, but perhaps we should think about a longer-term
solution that covers more situations/platforms.
Anshuman's proposal is really a good starting point for us.

Cheers,
Bob Liu
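
For readers following the thread, the grep output quoted above prints one "path:value" pair per attribute. A minimal C sketch of how a userspace tool might split such a line into an attribute name and a numeric value. The struct and function names are hypothetical and are not part of the patch series. The sketch only parses text in the quoted format and makes no assumption about where the mem_tgt directories live in sysfs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed attribute. */
struct hmat_attr {
	char name[64];
	long value;
};

/*
 * Parse a line such as "mem_tgt2/local_init/read_bw_MBps:40960".
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_attr_line(const char *line, struct hmat_attr *out)
{
	const char *colon = strrchr(line, ':');
	const char *start;
	char *end;
	size_t len;

	if (!colon || colon == line)
		return -1;

	/* Walk back from the colon to the start of the last path component. */
	start = colon;
	while (start > line && start[-1] != '/')
		start--;

	len = (size_t)(colon - start);
	if (len == 0 || len >= sizeof(out->name))
		return -1;
	memcpy(out->name, start, len);
	out->name[len] = '\0';

	out->value = strtol(colon + 1, &end, 10);
	if (end == colon + 1)
		return -1;	/* no digits after the colon */
	return 0;
}
```

Feeding it the quoted line for read_bw_MBps yields the name "read_bw_MBps" and the value 40960.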

Re: [PATCH v3 27/27] devres: kill devm_ioremap_nocache

2017-12-24 Thread Yisheng Xie



On 2017/12/23 21:45, Greg KH wrote:
> On Sat, Dec 23, 2017 at 07:02:59PM +0800, Yisheng Xie wrote:
>> --- a/lib/devres.c
>> +++ b/lib/devres.c
>> @@ -44,35 +44,6 @@ void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
>>  EXPORT_SYMBOL(devm_ioremap);
>>  
>>  /**
>> - * devm_ioremap_nocache - Managed ioremap_nocache()
>> - * @dev: Generic device to remap IO address for
>> - * @offset: Resource address to map
>> - * @size: Size of map
>> - *
>> - * Managed ioremap_nocache().  Map is automatically unmapped on driver
>> - * detach.
>> - */
>> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
>> -				   resource_size_t size)
>> -{
>> -	void __iomem **ptr, *addr;
>> -
>> -	ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
>> -	if (!ptr)
>> -		return NULL;
>> -
>> -	addr = ioremap_nocache(offset, size);
> 
> Wait, devm_ioremap() calls ioremap(), not ioremap_nocache(), are you
> _SURE_ that these are all identical?  For all arches?  If so, then
> ioremap_nocache() can also be removed, right?

Yeah, as Christophe pointed out, 4 archs do not have the same function.
But I do not know why they do not want to do the same thing. A driver may not
know about this, right?

> 
> In my quick glance, I don't think you can do this series at all :(

Yes, maybe I should take Christophe's suggestion and use a bool or enum to
distinguish them?

Thanks
Yisheng
> 
> greg k-h
> 
> .
>
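
A rough userspace sketch in C of the "bool or enum" idea mentioned in this reply: a single managed-remap helper takes an enum that selects the underlying mapping variant, so the two variants stay distinct on architectures where they differ. Everything here is hypothetical. The enum, the helper name, and the fake_* stubs are invented for illustration and are not the kernel's ioremap() or devres API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector for the mapping variant. */
enum remap_kind {
	REMAP_DEFAULT,
	REMAP_NOCACHE,
};

/* Records which stub ran last, so the dispatch can be observed. */
static enum remap_kind last_variant;
static char fake_region[16];

/* Stand-ins for ioremap() and ioremap_nocache(); they map nothing. */
static void *fake_ioremap(unsigned long offset, size_t size)
{
	(void)offset;
	(void)size;
	last_variant = REMAP_DEFAULT;
	return fake_region;
}

static void *fake_ioremap_nocache(unsigned long offset, size_t size)
{
	(void)offset;
	(void)size;
	last_variant = REMAP_NOCACHE;
	return fake_region;
}

/* One common helper; the enum keeps the variants apart. */
static void *managed_ioremap(unsigned long offset, size_t size,
			     enum remap_kind kind)
{
	switch (kind) {
	case REMAP_DEFAULT:
		return fake_ioremap(offset, size);
	case REMAP_NOCACHE:
		return fake_ioremap_nocache(offset, size);
	}
	return NULL;
}
```

With this shape, thin wrappers for each variant could share the bookkeeping code while still calling the right mapping function per architecture.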
