date:20160525

Re: [Patch v4 0/9] * Fix kdump failure in system with amd iommu*

2016-05-25 Thread Baoquan He

Hi Joerg,

Attached are the console logs of the normal kernel and the kdump kernel
on a test machine I have at hand, together with the output of lspci -vt
and lspci -vvv. Earlier I tried deferring the call to set_dte_entry()
until the device driver calls __map_single() to actually allocate
coherent memory or set up the mapping, but that did not seem to work. I
am surprised by the simplicity and effectiveness of the Intel IOMMU fix,
but I can't figure out what I have missed.




-[:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Root Complex
   +-00.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) I/O Memory Management Unit
   +-01.0  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 Graphics]
   +-01.1  Advanced Micro Devices, Inc. [AMD/ATI] Kaveri HDMI/DP Audio Controller
   +-02.0  Advanced Micro Devices, Inc. [AMD] Device 1424
   +-03.0  Advanced Micro Devices, Inc. [AMD] Device 1424
   +-03.1-[01]00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
   +-04.0  Advanced Micro Devices, Inc. [AMD] Device 1424
   +-10.0  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
   +-10.1  Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller
   +-11.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [IDE mode]
   +-12.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
   +-12.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
   +-13.0  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
   +-13.2  Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller
   +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
   +-14.1  Advanced Micro Devices, Inc. [AMD] FCH IDE Controller
   +-14.2  Advanced Micro Devices, Inc. [AMD] FCH Azalia Controller
   +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
   +-14.4-[02]--
   +-14.5  Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller
   +-18.0  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 0
   +-18.1  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 1
   +-18.2  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 2
   +-18.3  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 3
   +-18.4  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 4
   \-18.5  Advanced Micro Devices, Inc. [AMD] Family 15h (Models 30h-3fh) Processor Function 5
[0.00] Linux version 4.6.0-rc7+ (b...@dhcp-129-10.nay.redhat.com) (gcc 
version 5.1.1 20150618 (Red Hat 5.1.1-4) (GCC) ) #34 SMP Tue May 24 17:6
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.6.0-rc7+ 
root=/dev/mapper/fedora_dhcp--129--10-root ro rd.lvm.lv=fedora_dhcp-129-10/root 
rd.lvm.lvl
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point 
registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, 
using 'standard' format.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x7cc99fff] usable
[0.00] BIOS-e820: [mem 0x7cc9a000-0x7ccc9fff] reserved
[0.00] BIOS-e820: [mem 0x7ccca000-0x7cf9] usable
[0.00] BIOS-e820: [mem 0x7cfa-0x7d06dfff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7d06e000-0x7e1c7fff] reserved
[0.00] BIOS-e820: [mem 0x7e1c8000-0x7e1c8fff] usable
[0.00] BIOS-e820: [mem 0x7e1c9000-0x7e3cefff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7e3cf000-0x7e850fff] usable
[0.00] BIOS-e820: [mem 0x7e851000-0x7efe1fff] reserved
[0.00] BIOS-e820: [mem 0x7efe2000-0x7eff] usable
[0.00] BIOS-e820: [mem 0xfec0-0xfec01fff] reserved
[0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved
[0.00] BIOS-e820: [mem 0xfed4-0xfed44fff] reserved
[0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved
[0.00]

Re: [PATCH 1/2] clk: exynos5420: Set ID for aclk333 gate clock

2016-05-25 Thread Krzysztof Kozlowski

On 05/24/2016 07:41 PM, Javier Martinez Canillas wrote:
> The aclk333 clock needs to be ungated during the MFC power domain switch,
> so set the clock ID to allow the Exynos power domain logic to look up this
> clock if it is defined in the MFC PD device tree node.
> 
> Signed-off-by: Javier Martinez Canillas 
> ---
> 
>  drivers/clk/samsung/clk-exynos5420.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

Re: [PATCH] lightnvm: clear reserved bit on generic addr

2016-05-25 Thread Matias Bjørling


On 05/11/2016 02:08 PM, Javier González wrote:

When an address is converted from device to generic mode, the reserved
bit needs to be cleared in order to signal that the address points to a
flash block, not to a cacheline on the write buffer.

Signed-off-by: Javier González 
---
  include/linux/lightnvm.h | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
index 45be892..3d2c380 100644
--- a/include/linux/lightnvm.h
+++ b/include/linux/lightnvm.h
@@ -418,6 +418,9 @@ static inline struct ppa_addr dev_to_generic_addr(struct 
nvm_dev *dev,
l.g.ch |= (r.ppa >> dev->ppaf.ch_offset) &
(((1 << dev->ppaf.ch_len) - 1));

+   /* On device side, reserved bit is always 0 */
+   l.g.reserved = 0;
+
return l;
  }




Thanks Javier. Applied for 4.8. I have changed it to l.ppa = 0 and 
updated the description a bit.

Re: [PATCH 2/5] asus-wmi: Create quirk for airplane_mode LED

2016-05-25 Thread Corentin Chary

On Mon, Feb 8, 2016 at 6:05 PM, João Paulo Rechi Vita  wrote:
> Some Asus laptops that have an "airplane mode" indicator LED, also have
> the WMI WLAN user bit set, and the following bits in their DSDT:
>
> Scope (_SB)
> {
>   (...)
>   Device (ATKD)
>   {
> (...)
> Method (WMNB, 3, Serialized)
> {
>   (...)
>   If (LEqual (IIA0, 0x00010002))
>   {
> OWGD (IIA1)
> Return (One)
>   }
> }
>   }
> }
>
> So when asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store the
> wlan state, it drives the airplane mode indicator LED (through the call
> to OWGD) in an inverted fashion: the LED is ON when airplane mode is OFF
> (since wlan is ON), and vice-versa.
>
> This commit creates a quirk to not register a RFKill switch at all for
> these laptops, to allow the asus-wireless driver to drive the airplane
> mode LED correctly. It also adds a match to that quirk for the Asus
> X555UB.

This is really something that should get merged; multiple users are
affected by it. I do not own any of these laptops, but would there
be a way to detect this behavior instead of having static quirks?
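
For readers unfamiliar with the static-quirk approach being discussed, here is a minimal userspace sketch of the idea: a table maps a vendor/product pair to a quirk entry, and the rfkill registration step is skipped when the matched entry says so. All `demo_*` names are hypothetical and exist only for illustration. The sketch also uses exact string comparison, which is an assumption and may differ from how the kernel's DMI matching behaves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the driver's quirk structures. */
struct demo_quirk {
	bool no_rfkill;
};

struct demo_match {
	const char *vendor;
	const char *product;
	const struct demo_quirk *quirk;
};

static const struct demo_quirk demo_quirk_default = { .no_rfkill = false };
static const struct demo_quirk demo_quirk_no_rfkill = { .no_rfkill = true };

/* Static table: one entry per known-affected machine, NULL-terminated. */
static const struct demo_match demo_quirks[] = {
	{ "ASUSTeK COMPUTER INC.", "X555UB", &demo_quirk_no_rfkill },
	{ NULL, NULL, NULL },
};

/* Return the quirk for a machine, or the default when nothing matches. */
static const struct demo_quirk *demo_find_quirk(const char *vendor,
						const char *product)
{
	const struct demo_match *m;

	for (m = demo_quirks; m->vendor; m++) {
		if (!strcmp(m->vendor, vendor) && !strcmp(m->product, product))
			return m->quirk;
	}
	return &demo_quirk_default;
}

/* Mirrors the guarded call in the patch: register unless the quirk forbids it. */
static bool demo_should_register_rfkill(const char *vendor, const char *product)
{
	return !demo_find_quirk(vendor, product)->no_rfkill;
}
```

The cost of this design is the one raised above: every affected model needs its own table entry, whereas a runtime check would cover machines nobody has reported yet.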

> Signed-off-by: João Paulo Rechi Vita 
> ---
>  drivers/platform/x86/asus-nb-wmi.c | 13 +
>  drivers/platform/x86/asus-wmi.c|  8 +---
>  drivers/platform/x86/asus-wmi.h|  1 +
>  3 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/platform/x86/asus-nb-wmi.c 
> b/drivers/platform/x86/asus-nb-wmi.c
> index 131fee2..cfee863 100644
> --- a/drivers/platform/x86/asus-nb-wmi.c
> +++ b/drivers/platform/x86/asus-nb-wmi.c
> @@ -78,6 +78,10 @@ static struct quirk_entry quirk_asus_x200ca = {
> .wapf = 2,
>  };
>
> +static struct quirk_entry quirk_no_rfkill = {
> +   .no_rfkill = true,
> +};
> +
>  static int dmi_matched(const struct dmi_system_id *dmi)
>  {
> quirks = dmi->driver_data;
> @@ -297,6 +301,15 @@ static const struct dmi_system_id asus_quirks[] = {
> },
> .driver_data = &quirk_asus_x200ca,
> },
> +   {
> +   .callback = dmi_matched,
> +   .ident = "ASUSTeK COMPUTER INC. X555UB",
> +   .matches = {
> +   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> +   DMI_MATCH(DMI_PRODUCT_NAME, "X555UB"),
> +   },
> +   .driver_data = &quirk_no_rfkill,
> +   },
> {},
>  };
>
> diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
> index a96630d..370fa347 100644
> --- a/drivers/platform/x86/asus-wmi.c
> +++ b/drivers/platform/x86/asus-wmi.c
> @@ -2064,9 +2064,11 @@ static int asus_wmi_add(struct platform_device *pdev)
> if (err)
> goto fail_leds;
>
> -   err = asus_wmi_rfkill_init(asus);
> -   if (err)
> -   goto fail_rfkill;
> +   if (!asus->driver->quirks->no_rfkill) {
> +   err = asus_wmi_rfkill_init(asus);
> +   if (err)
> +   goto fail_rfkill;
> +   }
>
> /* Some Asus desktop boards export an acpi-video backlight interface,
>stop this from showing up */
> diff --git a/drivers/platform/x86/asus-wmi.h b/drivers/platform/x86/asus-wmi.h
> index 4da4c8b..5de1df5 100644
> --- a/drivers/platform/x86/asus-wmi.h
> +++ b/drivers/platform/x86/asus-wmi.h
> @@ -38,6 +38,7 @@ struct key_entry;
>  struct asus_wmi;
>
>  struct quirk_entry {
> +   bool no_rfkill;
> bool hotplug_wireless;
> bool scalar_panel_brightness;
> bool store_backlight_power;
> --
> 2.5.0
>



-- 
Corentin Chary
http://xf.iksaif.net

Re: [PATCH] mm: check the return value of lookup_page_ext for all call sites

2016-05-25 Thread shakil




On 5/23/2016 10:16 AM, Yang Shi wrote:

Per the discussion with Joonsoo Kim [1], we need to check the return value of
lookup_page_ext() at all call sites, since it might return NULL in some cases
(for example during memory hotplug), although this is unlikely.

Tested with ltp with "page_owner=0".

[1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE
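
The pattern applied at each call site can be sketched in plain userspace C as follows. The `demo_*` types and functions are hypothetical stand-ins, not the kernel definitions; the point is only that readers fall back to a safe default and writers skip the update when the lookup yields NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum { DEMO_EXT_YOUNG = 0 };

struct demo_page_ext {
	unsigned long flags;
};

struct demo_page {
	struct demo_page_ext *ext; /* NULL models "page_ext not available" */
};

/* Models a lookup that may legitimately return NULL. */
static struct demo_page_ext *demo_lookup_page_ext(struct demo_page *page)
{
	return page->ext;
}

/* Reader: report "not young" when there is no page_ext to consult. */
static bool demo_page_is_young(struct demo_page *page)
{
	struct demo_page_ext *ext = demo_lookup_page_ext(page);

	if (!ext)
		return false;
	return (ext->flags >> DEMO_EXT_YOUNG) & 1UL;
}

/* Writer: skip the update when there is no page_ext. */
static void demo_set_page_young(struct demo_page *page)
{
	struct demo_page_ext *ext = demo_lookup_page_ext(page);

	if (!ext)
		return;
	ext->flags |= 1UL << DEMO_EXT_YOUNG;
}
```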

Signed-off-by: Yang Shi 
---
  include/linux/page_idle.h | 43 ---
  mm/page_alloc.c   |  6 ++
  mm/page_owner.c   | 27 +++
  mm/page_poison.c  |  8 +++-
  mm/vmstat.c   |  2 ++
  5 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index bf268fa..8f5d4ad 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -46,33 +46,62 @@ extern struct page_ext_operations page_idle_ops;
  
  static inline bool page_is_young(struct page *page)

  {
-   return test_bit(PAGE_EXT_YOUNG, &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return false;
+
+   return test_bit(PAGE_EXT_YOUNG, &page_ext->flags);
  }
  
  static inline void set_page_young(struct page *page)

  {
-   set_bit(PAGE_EXT_YOUNG, &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return;
+
+   set_bit(PAGE_EXT_YOUNG, &page_ext->flags);
  }
  
  static inline bool test_and_clear_page_young(struct page *page)

  {
-   return test_and_clear_bit(PAGE_EXT_YOUNG,
- &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return false;
+
+   return test_and_clear_bit(PAGE_EXT_YOUNG, &page_ext->flags);
  }
  
  static inline bool page_is_idle(struct page *page)

  {
-   return test_bit(PAGE_EXT_IDLE, &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return false;
+
+   return test_bit(PAGE_EXT_IDLE, &page_ext->flags);
  }
  
  static inline void set_page_idle(struct page *page)

  {
-   set_bit(PAGE_EXT_IDLE, &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return;
+
+   set_bit(PAGE_EXT_IDLE, &page_ext->flags);
  }
  
  static inline void clear_page_idle(struct page *page)

  {
-   clear_bit(PAGE_EXT_IDLE, &lookup_page_ext(page)->flags);
+   struct page_ext *page_ext;
+   page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return;
+
+   clear_bit(PAGE_EXT_IDLE, &page_ext->flags);
  }
  #endif /* CONFIG_64BIT */
  
diff --git a/mm/page_alloc.c b/mm/page_alloc.c

index f8f3bfc..d27e8b9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -656,6 +656,9 @@ static inline void set_page_guard(struct zone *zone, struct 
page *page,
return;
  
  	page_ext = lookup_page_ext(page);

+   if (unlikely(!page_ext))
+   return;
+
__set_bit(PAGE_EXT_DEBUG_GUARD, &page_ext->flags);
  
  	INIT_LIST_HEAD(&page->lru);

@@ -673,6 +676,9 @@ static inline void clear_page_guard(struct zone *zone, 
struct page *page,
return;
  
  	page_ext = lookup_page_ext(page);

+   if (unlikely(!page_ext))
+   return;
+
__clear_bit(PAGE_EXT_DEBUG_GUARD, &page_ext->flags);
  
  	set_page_private(page, 0);

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 792b56d..902e398 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -55,6 +55,8 @@ void __reset_page_owner(struct page *page, unsigned int order)
  
  	for (i = 0; i < (1 << order); i++) {

page_ext = lookup_page_ext(page + i);
+   if (unlikely(!page_ext))
+   continue;
__clear_bit(PAGE_EXT_OWNER, &page_ext->flags);
}
  }
@@ -62,6 +64,10 @@ void __reset_page_owner(struct page *page, unsigned int 
order)
  void __set_page_owner(struct page *page, unsigned int order, gfp_t gfp_mask)
  {
struct page_ext *page_ext = lookup_page_ext(page);
+
+   if (unlikely(!page_ext))
+   return;
+
struct stack_trace trace = {
.nr_entries = 0,
.max_entries = ARRAY_SIZE(page_ext->trace_entries),
@@ -82,6 +88,8 @@ void __set_page_owner(struct page *page, unsigned int order, 
gfp_t gfp_mask)
  void __set_page_owner_migrate_reason(struct page *page, int reason)
  {
struct page_ext *page_ext = lookup_page_ext(page);
+   if (unlikely(!page_ext))
+   return;
  
  	page_ext->last_migrate_reason = reason;

  }
@@ -89,6 +97,12 @@ void __set_page_owner_migrate_reason(str

[tip:sched/urgent] sched/core: Fix remote wakeups

2016-05-25 Thread tip-bot for Peter Zijlstra

Commit-ID:  b7e7ade34e6188bee2e3b0d42b51d25137d9e2a5
Gitweb: http://git.kernel.org/tip/b7e7ade34e6188bee2e3b0d42b51d25137d9e2a5
Author: Peter Zijlstra 
AuthorDate: Mon, 23 May 2016 11:19:07 +0200
Committer:  Ingo Molnar 
CommitDate: Wed, 25 May 2016 08:35:18 +0200

sched/core: Fix remote wakeups

Commit:

  b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")

... introduced a bug: Mike Galbraith found that it introduced a
performance regression, while Paul E. McKenney reported lost
wakeups and bisected it to this commit.

The reason is that I mis-read ttwu_queue() such that I assumed any
wakeup that got a remote queue must have had the task migrated.

Since this is not so, we need to transfer this information between
queueing the wakeup and actually doing the wakeup. Use a new
task_struct::sched_flag for this; we already write to
sched_contributes_to_load in the wakeup path, so this is a hot and
modified cacheline.
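
The hand-off described above can be illustrated with a small userspace sketch. It is not the scheduler code: the `demo_*` names, the flag value, and the singly linked list are assumptions made for illustration. The waker records whether the wakeup involved a migration, and the side that drains the pending list rebuilds the wake flags per task instead of assuming every queued wakeup migrated.

```c
#include <stddef.h>

#define DEMO_WF_MIGRATED 0x4 /* hypothetical flag value */

struct demo_task {
	struct demo_task *next;         /* link in the pending-wakeup list */
	unsigned int remote_wakeup : 1; /* recorded when the wakeup is queued */
	int activated_flags;            /* flags seen when the wakeup is done */
};

/* Waker side: remember whether this wakeup involved a migration. */
static void demo_queue_remote(struct demo_task **list, struct demo_task *p,
			      int wake_flags)
{
	p->remote_wakeup = !!(wake_flags & DEMO_WF_MIGRATED);
	p->next = *list;
	*list = p;
}

/* Target side: rebuild the wake flags per task from the recorded bit. */
static void demo_drain_pending(struct demo_task **list)
{
	while (*list) {
		struct demo_task *p = *list;
		int wake_flags = 0;

		*list = p->next;
		if (p->remote_wakeup)
			wake_flags = DEMO_WF_MIGRATED;
		p->activated_flags = wake_flags;
	}
}
```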

Reported-by: Paul E. McKenney 
Reported-by: Mike Galbraith 
Tested-by: Mike Galbraith 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Andrew Hunter 
Cc: Andy Lutomirski 
Cc: Ben Segall 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Dave Hansen 
Cc: Denys Vlasenko 
Cc: Fenghua Yu 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Morten Rasmussen 
Cc: Oleg Nesterov 
Cc: Paul Turner 
Cc: Pavan Kondeti 
Cc: Peter Zijlstra 
Cc: Quentin Casasnovas 
Cc: Thomas Gleixner 
Cc: byungchul.p...@lge.com
Fixes: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")
Link: http://lkml.kernel.org/r/20160523091907.gd15...@worktop.ger.corp.intel.com
Signed-off-by: Ingo Molnar 
---
 include/linux/sched.h |  1 +
 kernel/sched/core.c   | 18 +++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6cc0df9..e053517 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1533,6 +1533,7 @@ struct task_struct {
unsigned sched_reset_on_fork:1;
unsigned sched_contributes_to_load:1;
unsigned sched_migrated:1;
+   unsigned sched_remote_wakeup:1;
unsigned :0; /* force alignment to the next boundary */
 
/* unserialized, strictly 'current' */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c078..7f2cae4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void)
cookie = lockdep_pin_lock(&rq->lock);
 
while (llist) {
+   int wake_flags = 0;
+
p = llist_entry(llist, struct task_struct, wake_entry);
llist = llist_next(llist);
-   /*
-* See ttwu_queue(); we only call ttwu_queue_remote() when
-* its a x-cpu wakeup.
-*/
-   ttwu_do_activate(rq, p, WF_MIGRATED, cookie);
+
+   if (p->sched_remote_wakeup)
+   wake_flags = WF_MIGRATED;
+
+   ttwu_do_activate(rq, p, wake_flags, cookie);
}
 
lockdep_unpin_lock(&rq->lock, cookie);
@@ -1819,10 +1821,12 @@ void scheduler_ipi(void)
irq_exit();
 }
 
-static void ttwu_queue_remote(struct task_struct *p, int cpu)
+static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
 {
struct rq *rq = cpu_rq(cpu);
 
+   p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
+
if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
if (!set_nr_if_polling(rq->idle))
smp_send_reschedule(cpu);
@@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p, int cpu, 
int wake_flags)
 #if defined(CONFIG_SMP)
if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), 
cpu)) {
sched_clock_cpu(cpu); /* sync clocks x-cpu */
-   ttwu_queue_remote(p, cpu);
+   ttwu_queue_remote(p, cpu, wake_flags);
return;
}
 #endif

Re: [Patch v4 0/9] * Fix kdump failure in system with amd iommu*

2016-05-25 Thread Baoquan He

Sorry, the log of 'lspci -vvv' was not attached correctly. Re-attaching it here.

00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 
30h-3fh) Processor Root Complex
Subsystem: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 
30h-3fh) Processor Root Complex
Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Kaveri [Radeon R7 Graphics] (prog-if 00 [VGA controller])
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0123
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: radeon
Kernel modules: radeon

00:01.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Kaveri HDMI/DP 
Audio Controller
Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device 0123
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1424
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: 
Kernel driver in use: pcieport
Kernel modules: shpchp

00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1424
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 
Kernel driver in use: xhci_hcd

00:10.1 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI 
Controller (rev 09) (prog-if 30 [XHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: xhci_hcd

00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller 
[IDE mode] (rev 40) (prog-if 01 [AHCI 1.0])
Subsystem: Gigabyte Technology Co., Ltd Device b002
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- 
Kernel driver in use: ahci

00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI 
Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- 
Kernel driver in use: ehci-pci

00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI 
Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- 
Kernel driver in use: ehci-pci

00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 16)
Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- TAbort- SERR- 
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11)
Subsystem: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- TAbort- 
SERR- TAbort- 
Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-

00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI 
Controller (rev 11) (prog-if 10 [OHCI])
Subsystem: Gigabyte Technology Co., Ltd Device 5004
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR-

[PATCH 00/10] String hash improvements

2016-05-25 Thread George Spelvin

On Tue, 17 May 2016 at 09:32, Linus Torvalds wrote:
> On Tue, May 17, 2016 at 6:41 AM, George Spelvin  wrote:
>> I had assumed that since they weren't fully baked when the window opened,
>> they weren't eligible, but I'll try.

> Hey, if they aren't ready, they aren't.

Well, they're close, and I can and did *get* them ready.

> How about just the minimal set of patches that you're happy with as-is?

The things are a bit interdependent.  I can't fix hash_64() on 32-bit systems
until I get rid of hash_string()'s need for it to return 64 bits, which
requires work on the dcache hashes to make them suitable replacements...

The real fun has come from TPTB deciding to sell the horizon.com domain,
and it turns out that updating rDNS takes the ISP a whole freaking week,
during which time outgoing mail trips everyone's spam filters.

That finally got fixed, just in time for me to put my dominant hand through
a piece of glass.  It's been a week. :-(

Anyway, the patches...

This series does several related things:
1) Gets rid of the string hashes in ,
   and uses the dcache hash (fs/namei.c) instead.
2) Avoid 64-bit multiplies in hash_64() on 32-bit platforms.
   Two 32-bit multiplies will do well enough.
3) Rids the world of the bad hash multipliers in hash_32.
   This finishes the job started in 689de1d6ca.
   The vast majority of Linux architectures have hardware support
   for 32x32-bit multiply and so derive no benefit from "simplified"
   multipliers.
   The few processors that do not (68000, h8/300 and some models of
   Microblaze) have arch-specific implementations added.  Those patches
   are last in the series so they can go through the relevant arch
   maintainers.
4) Overhauls the dcache hash mixing.
   The patch in 2bf0b16954 was an off-the-cuff suggestion.  Replaced with
   a much more careful design that's simultaneously faster and better.
   (My own invention, as there was nothing suitable in the literature I
   could find.  Comments welcome!)
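
As an illustration of item 2, here is one way a 64-bit value could be hashed with only two 32-bit multiplies. This is a sketch under stated assumptions, not necessarily what the series implements: the `demo_*` names are hypothetical, and the multiplier is a 32-bit golden-ratio constant picked for the example.

```c
#include <stdint.h>

/* Assumed multiplier: a 32-bit golden-ratio constant, chosen for illustration. */
#define DEMO_GOLDEN_RATIO_32 0x61C88647u

/* Multiplicative hash keeping all 32 bits of the product. */
static uint32_t demo_hash_32_full(uint32_t val)
{
	return val * DEMO_GOLDEN_RATIO_32;
}

/* Keep the top 'bits' bits (1..32), which are the best mixed. */
static uint32_t demo_hash_32(uint32_t val, unsigned int bits)
{
	return demo_hash_32_full(val) >> (32 - bits);
}

/*
 * Hash a 64-bit value with two 32x32 multiplies: hash the high half,
 * fold it into the low half with XOR, then hash the combined word.
 */
static uint32_t demo_hash_64(uint64_t val, unsigned int bits)
{
	uint32_t lo = (uint32_t)val;
	uint32_t hi = (uint32_t)(val >> 32);

	return demo_hash_32(lo ^ demo_hash_32_full(hi), bits);
}
```

Folding the hashed high half into the low half means both halves of the input affect the result, without ever needing a 64x64-bit multiply.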

Things I thought about but can wait for now:
5) Modify the hash_name() loop to skip the initial HASH_MIX().
   That would let us salt the hash if we ever wanted to.
6) Modify bytemask_from_count to handle inputs of 1..sizeof(long)
   rather than 0..sizeof(long)-1.  This would simplify all its users
   including full_name_hash.
7) Sort out partial_name_hash().
   The hash function is declared as using a long state, even though
   it's truncated to 32 bits at the end and the extra internal state
   contributes nothing to the result.  And some callers do odd things:
   * fs/hfs/string.c only allocates 32 bits of state
   * fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes

I'm not particularly fond of the names of the header files I created,
but if anyone has a better idea please talk fast!

George Spelvin (10):
  Pull out string hash to 
  fs/namei.c: Add hash_string() function.
  : Define hash_str() in terms of hash_string()
  Change hash_64() return value to 32 bits.
  Eliminate bad hash multipliers from hash_32() and hash_64().
  fs/namei.c: Improve dcache hash function
  : Add support for architecture-specific functions
  m68k: Add 
  microblaze: Add 
  h8300: Add 

 arch/Kconfig   |   8 ++
 arch/h8300/Kconfig |   1 +
 arch/h8300/include/asm/archhash.h  |  52 
 arch/m68k/Kconfig  |   1 +
 arch/m68k/include/asm/archhash.h   |  67 +++
 arch/microblaze/Kconfig|   1 +
 arch/microblaze/include/asm/archhash.h |  80 ++
 fs/namei.c | 149 +
 include/linux/dcache.h |  27 +-
 include/linux/hash.h   | 111 
 include/linux/stringhash.h |  76 +
 include/linux/sunrpc/svcauth.h |  36 ++--
 12 files changed, 464 insertions(+), 145 deletions(-)
 create mode 100644 arch/h8300/include/asm/archhash.h
 create mode 100644 arch/m68k/include/asm/archhash.h
 create mode 100644 arch/microblaze/include/asm/archhash.h
 create mode 100644 include/linux/stringhash.h

-- 
2.8.1

[PATCH 01/10] Pull out string hash to

2016-05-25 Thread George Spelvin

... so they can be used without the rest of 

The hashlen_* macros will make sense next patch.

Signed-off-by: George Spelvin 
---
 include/linux/dcache.h | 27 +
 include/linux/stringhash.h | 72 ++
 2 files changed, 73 insertions(+), 26 deletions(-)
 create mode 100644 include/linux/stringhash.h

diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 7e9422cb..0f9a977c 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct path;
 struct vfsmount;
@@ -52,9 +53,6 @@ struct qstr {
 };
 
 #define QSTR_INIT(n,l) { { { .len = l } }, .name = n }
-#define hashlen_hash(hashlen) ((u32) (hashlen))
-#define hashlen_len(hashlen)  ((u32)((hashlen) >> 32))
-#define hashlen_create(hash,len) (((u64)(len)<<32)|(u32)(hash))
 
 struct dentry_stat_t {
long nr_dentry;
@@ -65,29 +63,6 @@ struct dentry_stat_t {
 };
 extern struct dentry_stat_t dentry_stat;
 
-/* Name hashing routines. Initial hash value */
-/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
-#define init_name_hash()   0
-
-/* partial hash update function. Assume roughly 4 bits per character */
-static inline unsigned long
-partial_name_hash(unsigned long c, unsigned long prevhash)
-{
-   return (prevhash + (c << 4) + (c >> 4)) * 11;
-}
-
-/*
- * Finally: cut down the number of bits to a int value (and try to avoid
- * losing bits)
- */
-static inline unsigned long end_name_hash(unsigned long hash)
-{
-   return (unsigned int) hash;
-}
-
-/* Compute the hash for a name string. */
-extern unsigned int full_name_hash(const unsigned char *, unsigned int);
-
 /*
  * Try to keep struct dentry aligned on 64 byte cachelines (this will
  * give reasonable cacheline footprint with larger lines without the
diff --git a/include/linux/stringhash.h b/include/linux/stringhash.h
new file mode 100644
index ..144d8c0f
--- /dev/null
+++ b/include/linux/stringhash.h
@@ -0,0 +1,72 @@
+#ifndef __LINUX_STRINGHASH_H
+#define __LINUX_STRINGHASH_H
+
+#include 
+
+/*
+ * Routines for hashing strings of bytes to a 32-bit hash value.
+ *
+ * These hash functions are NOT GUARANTEED STABLE between kernel
+ * versions, architectures, or even repeated boots of the same kernel.
+ * (E.g. they may depend on boot-time hardware detection or be
+ * deliberately randomized.)
+ *
+ * They are also not intended to be secure against collisions caused by
+ * malicious inputs; much slower hash functions are required for that.
+ *
+ * They are optimized for pathname components, meaning short strings.
+ * Even if a majority of files have longer names, the dynamic profile of
+ * pathname components skews short due to short directory names.
+ * (E.g. /usr/lib/libsesquipedalianism.so.3.141.)
+ */
+
+/*
+ * Version 1: one byte at a time.  Example of use:
+ *
+ * unsigned long hash = init_name_hash();
+ * while (*p)
+ * hash = partial_name_hash(tolower(*p++), hash);
+ * hash = end_name_hash(hash);
+ *
+ * Although this is designed for bytes, fs/hfsplus/unicode.c
+ * abuses it to hash 16-bit values.
+ */
+
+/* Hash courtesy of the R5 hash in reiserfs modulo sign bits */
+#define init_name_hash()   0
+
+/* partial hash update function. Assume roughly 4 bits per character */
+static inline unsigned long
+partial_name_hash(unsigned long c, unsigned long prevhash)
+{
+   return (prevhash + (c << 4) + (c >> 4)) * 11;
+}
+
+/*
+ * Finally: cut down the number of bits to a int value (and try to avoid
+ * losing bits)
+ */
+static inline unsigned long end_name_hash(unsigned long hash)
+{
+   return (unsigned int)hash;
+}
+
+/*
+ * Version 2: One word (32 or 64 bits) at a time.
+ * If CONFIG_DCACHE_WORD_ACCESS is defined (meaning 
+ * exists, which describes major Linux platforms like x86 and ARM), then
+ * this computes a different hash function much faster.
+ *
+ * If not set, this falls back to a wrapper around the preceding.
+ */
+extern unsigned int full_name_hash(const unsigned char *, unsigned int);
+
+/*
+ * A hash_len is a u64 with the hash of a string in the low
+ * half and the length in the high half.
+ */
+#define hashlen_hash(hashlen) ((u32)(hashlen))
+#define hashlen_len(hashlen)  ((u32)((hashlen) >> 32))
+#define hashlen_create(hash, len) ((u64)(len)<<32 | (u32)(hash))
+
+#endif /* __LINUX_STRINGHASH_H */
-- 
2.8.1
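
To make the "Version 1" usage and the hashlen_* packing in the patch above concrete, here is a small userspace sketch. The helper bodies are copied from the patch but renamed with a hypothetical `demo_` prefix and written with standard C types; `demo_name_hash()` is an illustrative wrapper that does not exist in the kernel.

```c
#include <stdint.h>

/* Userspace copies of the patch's helpers, renamed with a demo_ prefix. */
static unsigned long demo_partial_name_hash(unsigned long c,
					    unsigned long prevhash)
{
	return (prevhash + (c << 4) + (c >> 4)) * 11;
}

static unsigned int demo_end_name_hash(unsigned long hash)
{
	return (unsigned int)hash;
}

/* Byte-at-a-time hash of a NUL-terminated string; also reports its length. */
static unsigned int demo_name_hash(const char *name, unsigned int *len_out)
{
	const unsigned char *p = (const unsigned char *)name;
	unsigned long hash = 0; /* plays the role of init_name_hash() */
	unsigned int len = 0;

	while (*p) {
		hash = demo_partial_name_hash(*p++, hash);
		len++;
	}
	*len_out = len;
	return demo_end_name_hash(hash);
}

/* Pack the hash into the low 32 bits and the length into the high 32 bits. */
static uint64_t demo_hashlen_create(uint32_t hash, uint32_t len)
{
	return ((uint64_t)len << 32) | hash;
}

static uint32_t demo_hashlen_hash(uint64_t hashlen)
{
	return (uint32_t)hashlen;
}

static uint32_t demo_hashlen_len(uint64_t hashlen)
{
	return (uint32_t)(hashlen >> 32);
}
```

Packing both values into a single 64-bit word lets a caller compare hash and length with one comparison before falling back to a byte-by-byte string compare.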

Re: [PATCH 4.2.y-ckt 53/53] uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h

2016-05-25 Thread Mikko Rapeli

On Tue, May 24, 2016 at 10:55:23AM -0700, Kamal Mostafa wrote:
> 4.2.8-ckt11 -stable review patch.  If anyone has any objections, please let 
> me know.
> 
> ---8<
> 
> From: Mikko Rapeli 
> 
> [ Upstream commit 4a91cb61bb995e5571098188092e296192309c77 ]
> 
> glibc's net/if.h contains copies of definitions from linux/if.h and these
> conflict and cause build failures if both files are included by application
> source code. Changes in uapi headers, which fixed header file dependencies to
> include linux/if.h when it was needed, e.g. commit 1ffad83d, made the
> net/if.h and linux/if.h incompatibilities visible as build failures for
> userspace applications like iproute2 and xtables-addons.

Commit 1ffad83d from me was released in v4.4-rc2. Before that the linux/if.h
conflict with glibc net/if.h was hidden from most users of kernel uapi headers.

IMO, there is no need to backport this fix to kernel trees older than v4.4.

-Mikko

> This patch fixes compile errors when glibc net/if.h is included before
> linux/if.h:
> 
> ./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’
> ./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’
> ./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’
> ./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’
> ./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’
> ./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’
> ./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’
> ./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’
> ./linux/if.h:252:8: error: redefinition of ‘struct ifconf’
> ./linux/if.h:203:8: error: redefinition of ‘struct ifreq’
> ./linux/if.h:169:8: error: redefinition of ‘struct ifmap’
> ./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’
> ./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’
> ./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’
> ./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’
> ./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’
> ./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’
> ./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’
> ./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’
> 
> The cases where linux/if.h is included before net/if.h need a similar fix in
> the glibc side, or the order of include files can be changed in userspace
> code as a workaround.
> 
> This change was tested in x86 userspace on Debian unstable with
> scripts/headers_compile_test.sh:
> 
> $ make headers_install && \
>   cd usr/include && ../../scripts/headers_compile_test.sh -l -k
> ...
> cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I 
> /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I 
> /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I 
> /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu
>  -o /dev/null ./linux/if.h_libc_before_kernel.h
> PASSED libc before kernel test: ./linux/if.h
> 
> Reported-by: Jan Engelhardt 
> Reported-by: Josh Boyer 
> Reported-by: Stephen Hemminger 
> Reported-by: Waldemar Brodkorb 
> Cc: Gabriel Laskar 
> Signed-off-by: Mikko Rapeli 
> Signed-off-by: David S. Miller 
> Signed-off-by: Kamal Mostafa 
> ---
>  include/uapi/linux/if.h  | 28 +
>  include/uapi/linux/libc-compat.h | 44 
> 
>  2 files changed, 72 insertions(+)
> 
> diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
> index 9cf2394..752f5dc 100644
> --- a/include/uapi/linux/if.h
> +++ b/include/uapi/linux/if.h
> @@ -19,14 +19,20 @@
>  #ifndef _LINUX_IF_H
>  #define _LINUX_IF_H
>  
> +#include   /* for compatibility with glibc */
>  #include  /* for "__kernel_caddr_t" et al */
>  #include /* for "struct sockaddr" et al  */
>  #include   /* for "__user" et al   */
>  
> +#if __UAPI_DEF_IF_IFNAMSIZ
>  #define  IFNAMSIZ16
> +#endif /* __UAPI_DEF_IF_IFNAMSIZ */
>  #define  IFALIASZ256
>  #include 
>  
> +/* For glibc compatibility. An empty enum does not compile. */
> +#if __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO != 0 && \
> +__UAPI_DEF_IF_NET_DEVICE_FLAGS != 0
>  /**
>   * enum net_device_flags - &struct net_device flags
>   *
> @@ -68,6 +74,8 @@
>   * @IFF_ECHO: echo sent packets. Volatile.
>   */
>  enum net_device_flags {
> +/* for compatibility with glibc net/if.h */
> +#if __UAPI_DEF_IF_NET_DEVICE_FLAGS
>   IFF_UP  = 1<<0,  /* sysfs */
>   IFF_BROADCAST   = 1<<1,  /* volatile */
>   IFF_DEBUG   = 1<<2,  /* sysfs */
> @@ -84,11 +92,17 @@ enum net_device_flags {
>   IFF_PORTSEL = 1<<13, /* sysfs */
>   IFF_AU

Re: [PATCH v3 00/12] J-core J2 cpu and SoC peripherals support

2016-05-25 Thread Geert Uytterhoeven

Hi Rich,

On Wed, May 25, 2016 at 7:43 AM, Rich Felker  wrote:
> As arch/sh co-maintainer my intent is to include as much as possible
> in my pull request for the linux-sh tree. If there are parts outside
> of arch/sh that can be included in this, please let me know. I'm not
> clear yet on what the right path to upstream is for the clocksource
> and irq drivers that are currently only useful/interesting for one
> arch, or for the DT binding patches. Even if some drivers are delayed

Drivers outside arch/sh/ should go through the respective maintainer's tree,
unless that maintainer allows you to take it by giving his Acked-by.
The same for DT bindings for such drivers.

> going upstream, I would really like to get DT bindings acked and
> ideally merged, because we want to go ahead with moving the DTB into
> J2 boot rom where it belongs, and that should only happen with stable
> bindings.

Please also add a changelog, so people who reviewed your previous series
know what to skip and what to focus on.
Please add collected Acked-by and Reviewed-by tags when reposting.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH 02/10] fs/namei.c: Add hash_string() function

2016-05-25 Thread George Spelvin

It's not actually used in that file, but goes with hash_name() and
full_name_hash(), so it's simplest to keep it there.

Also simplify the prototype of full_name_hash to take
a "char *" rather than "unsigned char *".  That's consistent
with hash_name().
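
As a rough userspace illustration of what a "hash_len" carries, the sketch below packs a 32-bit hash and a length into one 64-bit value using the same layout as the hashlen_* macros. The names toy_partial_hash() and toy_hash_string() are hypothetical stand-ins, the mixing step is only a placeholder, and the loop is written as a plain while so that an empty string yields length 0; none of this is the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the hashlen_* macros: hash in the low half, length in the high half. */
#define hashlen_hash(hashlen)      ((uint32_t)(hashlen))
#define hashlen_len(hashlen)       ((uint32_t)((hashlen) >> 32))
#define hashlen_create(hash, len)  ((uint64_t)(len) << 32 | (uint32_t)(hash))

/* Placeholder per-byte mixing step (an assumption, not the kernel's definition). */
static inline unsigned long toy_partial_hash(unsigned long c, unsigned long prev)
{
	return (prev + (c << 4) + (c >> 4)) * 11;
}

/* Hash a NUL-terminated string one byte at a time; return hash and length together. */
static uint64_t toy_hash_string(const char *name)
{
	unsigned long hash = 0;
	unsigned long len = 0;
	unsigned long c;

	assert(name);
	c = (unsigned char)*name;
	while (c) {
		hash = toy_partial_hash(c, hash);
		len++;
		c = (unsigned char)name[len];
	}
	return hashlen_create((uint32_t)hash, len);
}
```

A caller that needs both values makes one pass over the string and then unpacks with hashlen_hash() and hashlen_len(), which is the point of returning a single u64.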

Signed-off-by: George Spelvin 
---
 fs/namei.c | 47 +++---
 include/linux/stringhash.h |  8 ++--
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 42f8ca03..ce640d65 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1822,7 +1822,8 @@ static inline unsigned long mix_hash(unsigned long hash)
 
 #endif
 
-unsigned int full_name_hash(const unsigned char *name, unsigned int len)
+/* Return the hash of a string of known length */
+unsigned int full_name_hash(const char *name, unsigned int len)
 {
unsigned long a, hash = 0;
 
@@ -1842,6 +1843,29 @@ done:
 }
 EXPORT_SYMBOL(full_name_hash);
 
+/* Return the "hash_len" (hash and length) of a null-terminated string */
+u64 hash_string(const char *name)
+{
+   unsigned long a, adata, mask, hash, len;
+   const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
+
+   hash = a = 0;
+   len = -sizeof(unsigned long);
+   do {
+   hash = mix_hash(hash + a);
+   len += sizeof(unsigned long);
+   a = load_unaligned_zeropad(name+len);
+   } while (!has_zero(a, &adata, &constants));
+
+   adata = prep_zero_mask(a, adata, &constants);
+   mask = create_zero_mask(adata);
+   hash += a & zero_bytemask(mask);
+   len += find_zero(mask);
+
+   return hashlen_create(fold_hash(hash), len);
+}
+EXPORT_SYMBOL(hash_string);
+
 /*
  * Calculate the length and hash of the path component, and
  * return the "hash_len" as the result.
@@ -1872,15 +1896,32 @@ static inline u64 hash_name(const char *name)
 
 #else
 
-unsigned int full_name_hash(const unsigned char *name, unsigned int len)
+/* Return the hash of a string of known length */
+unsigned int full_name_hash(const char *name, unsigned int len)
 {
unsigned long hash = init_name_hash();
while (len--)
-   hash = partial_name_hash(*name++, hash);
+   hash = partial_name_hash((unsigned char)*name++, hash);
return end_name_hash(hash);
 }
 EXPORT_SYMBOL(full_name_hash);
 
+/* Return the "hash_len" (hash and length) of a null-terminated string */
+u64 hash_string(const char *name)
+{
+   unsigned long hash = init_name_hash();
+   unsigned long len = 0, c;
+
+   c = (unsigned char)*name;
+   do {
+   len++;
+   hash = partial_name_hash(c, hash);
+   c = (unsigned char)name[len];
+   } while (c);
+   return hashlen_create(end_name_hash(hash), len);
+}
+EXPORT_SYMBOL(hash_string);
+
 /*
  * We know there's a real path component here of at least
  * one character.
diff --git a/include/linux/stringhash.h b/include/linux/stringhash.h
index 144d8c0f..d1c80ba2 100644
--- a/include/linux/stringhash.h
+++ b/include/linux/stringhash.h
@@ -1,7 +1,8 @@
 #ifndef __LINUX_STRINGHASH_H
 #define __LINUX_STRINGHASH_H
 
-#include 
+#include /* For __pure */
+#include/* For u32, u64 */
 
 /*
  * Routines for hashing strings of bytes to a 32-bit hash value.
@@ -59,7 +60,7 @@ static inline unsigned long end_name_hash(unsigned long hash)
  *
  * If not set, this falls back to a wrapper around the preceding.
  */
-extern unsigned int full_name_hash(const unsigned char *, unsigned int);
+extern unsigned int __pure full_name_hash(const char *, unsigned int);
 
 /*
  * A hash_len is a u64 with the hash of a string in the low
@@ -69,4 +70,7 @@ extern unsigned int full_name_hash(const unsigned char *, unsigned int);
 #define hashlen_len(hashlen)  ((u32)((hashlen) >> 32))
 #define hashlen_create(hash,len) ((u64)(len)<<32 | (u32)(hash))
 
+/* Return the "hash_len" (hash and length) of a null-terminated string */
+extern u64 __pure hash_string(const char *name);
+
 #endif /* __LINUX_STRINGHASH_H */
-- 
2.8.1

[PATCH 03/10] : Define hash_str() in terms of hash_string()

2016-05-25 Thread George Spelvin

Finally, the first use of the previous two patches: Eliminate the
separate ad-hoc string hash functions in the sunrpc code.

This also eliminates the only caller of hash_long which asks for
more than 32 bits of output.

sunrpc guys: Is it okay if I send this to Linus directly?

Signed-off-by: George Spelvin 
Cc: "J. Bruce Fields" 
Cc: Jeff Layton 
Cc: linux-...@vger.kernel.org
---
 include/linux/sunrpc/svcauth.h | 36 +---
 1 file changed, 5 insertions(+), 31 deletions(-)

diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index c00f53a4..ef2b2552 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct svc_cred {
@@ -165,41 +166,14 @@ extern int svcauth_unix_set_client(struct svc_rqst *rqstp);
 extern int unix_gid_cache_create(struct net *net);
 extern void unix_gid_cache_destroy(struct net *net);
 
-static inline unsigned long hash_str(char *name, int bits)
+static inline unsigned long hash_str(char const *name, int bits)
 {
-   unsigned long hash = 0;
-   unsigned long l = 0;
-   int len = 0;
-   unsigned char c;
-   do {
-   if (unlikely(!(c = *name++))) {
-   c = (char)len; len = -1;
-   }
-   l = (l << 8) | c;
-   len++;
-   if ((len & (BITS_PER_LONG/8-1))==0)
-   hash = hash_long(hash^l, BITS_PER_LONG);
-   } while (len);
-   return hash >> (BITS_PER_LONG - bits);
+   return hash_32(hashlen_hash(hash_string(name)), bits);
 }
 
-static inline unsigned long hash_mem(char *buf, int length, int bits)
+static inline unsigned long hash_mem(char const *buf, int length, int bits)
 {
-   unsigned long hash = 0;
-   unsigned long l = 0;
-   int len = 0;
-   unsigned char c;
-   do {
-   if (len == length) {
-   c = (char)len; len = -1;
-   } else
-   c = *buf++;
-   l = (l << 8) | c;
-   len++;
-   if ((len & (BITS_PER_LONG/8-1))==0)
-   hash = hash_long(hash^l, BITS_PER_LONG);
-   } while (len);
-   return hash >> (BITS_PER_LONG - bits);
+   return hash_32(full_name_hash(buf, length), bits);
 }
 
 #endif /* __KERNEL__ */
-- 
2.8.1

[PATCH 04/10] Change hash_64() return value to 32 bits

2016-05-25 Thread George Spelvin

That's all that's ever asked for, and it makes the return
type of hash_long() consistent.

It also allows (upcoming patch) an optimized implementation
of hash_64 on 32-bit machines.

There's a WARN_ON in there in case I missed anything.  Most callers pass
a compile-time constant bits and will have no run-time overhead.

Signed-off-by: George Spelvin 
---
 include/linux/hash.h | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/linux/hash.h b/include/linux/hash.h
index 79c52fa8..b9201c33 100644
--- a/include/linux/hash.h
+++ b/include/linux/hash.h
@@ -48,7 +48,7 @@
 #define GOLDEN_RATIO_32 0x61C88647
 #define GOLDEN_RATIO_64 0x61C8864680B583EBull
 
-static __always_inline u64 hash_64(u64 val, unsigned int bits)
+static __always_inline u32 hash_64(u64 val, unsigned int bits)
 {
u64 hash = val;
 
@@ -71,8 +71,14 @@ static __always_inline u64 hash_64(u64 val, unsigned int bits)
hash += n;
 #endif
 
+   if (__builtin_constant_p(bits > 32 || bits == 0)) {
+   BUILD_BUG_ON(bits > 32 || bits == 0);
+   } else {
+   WARN_ON(bits > 32 || bits == 0);
+   }
+
/* High bits are more random, so use them. */
-   return hash >> (64 - bits);
+   return (u32)(hash >> (64 - bits));
 }
 
 static inline u32 hash_32(u32 val, unsigned int bits)
@@ -84,7 +90,7 @@ static inline u32 hash_32(u32 val, unsigned int bits)
return hash >> (32 - bits);
 }
 
-static inline unsigned long hash_ptr(const void *ptr, unsigned int bits)
+static inline u32 hash_ptr(const void *ptr, unsigned int bits)
 {
return hash_long((unsigned long)ptr, bits);
 }
-- 
2.8.1

[PATCH 05/10] Eliminate bad hash multipliers from hash_32() and hash_64()

2016-05-25 Thread George Spelvin

To avoid inefficiency, hash_64() on 32-bit systems is changed
to use a different algorithm.  It makes two calls to hash_32()
instead.
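
A minimal userspace sketch of the two hash_64() strategies, for illustration only. The constants are the GOLDEN_RATIO_32 and GOLDEN_RATIO_64 values from this series; the function names (mul_hash_32, sketch_hash_32, sketch_hash_64_wide, sketch_hash_64_narrow) are hypothetical, and the assert() calls stand in for the kernel's BUILD_BUG_ON/WARN_ON checks.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Multiplicative step: 32x32 -> low 32 bits of the product. */
static inline uint32_t mul_hash_32(uint32_t val)
{
	return val * GOLDEN_RATIO_32;
}

/* Keep the top 'bits' bits of the product; they depend on every input bit. */
static inline uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return mul_hash_32(val) >> (32 - bits);
}

/* Variant for machines with a cheap 64x64-bit multiply. */
static inline uint32_t sketch_hash_64_wide(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)(val * GOLDEN_RATIO_64 >> (64 - bits));
}

/* Variant using only 32x32-bit multiplies: hash the high half, fold it into the low half, hash again. */
static inline uint32_t sketch_hash_64_narrow(uint64_t val, unsigned int bits)
{
	uint32_t lo = (uint32_t)val;
	uint32_t hi = (uint32_t)(val >> 32);

	return sketch_hash_32(lo - mul_hash_32(hi), bits);
}
```

The two variants generally return different values for the same input, so the choice must be fixed per build; they only need to distribute well, not agree with each other.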

Signed-off-by: George Spelvin 
---
 include/linux/hash.h | 100 ++-
 1 file changed, 43 insertions(+), 57 deletions(-)

diff --git a/include/linux/hash.h b/include/linux/hash.h
index b9201c33..8926f369 100644
--- a/include/linux/hash.h
+++ b/include/linux/hash.h
@@ -3,91 +3,76 @@
 /* Fast hashing routine for ints,  longs and pointers.
(C) 2002 Nadia Yvette Chambers, IBM */
 
-/*
- * Knuth recommends primes in approximately golden ratio to the maximum
- * integer representable by a machine word for multiplicative hashing.
- * Chuck Lever verified the effectiveness of this technique:
- * http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf
- *
- * These primes are chosen to be bit-sparse, that is operations on
- * them can use shifts and additions instead of multiplications for
- * machines where multiplications are slow.
- */
-
 #include 
 #include 
 
-/* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
-#define GOLDEN_RATIO_PRIME_32 0x9e370001UL
-/*  2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1 */
-#define GOLDEN_RATIO_PRIME_64 0x9e37fffc0001UL
-
+/*
+ * The "GOLDEN_RATIO_PRIME" is used in fs/btrfs/btrfs_inode.h and
+ * fs/inode.c.  It's not actually prime any more (the previous primes
+ * were actively bad for hashing), but the name remains.
+ */
 #if BITS_PER_LONG == 32
-#define GOLDEN_RATIO_PRIME GOLDEN_RATIO_PRIME_32
+#define GOLDEN_RATIO_PRIME GOLDEN_RATIO_32
 #define hash_long(val, bits) hash_32(val, bits)
 #elif BITS_PER_LONG == 64
 #define hash_long(val, bits) hash_64(val, bits)
-#define GOLDEN_RATIO_PRIME GOLDEN_RATIO_PRIME_64
+#define GOLDEN_RATIO_PRIME GOLDEN_RATIO_64
 #else
 #error Wordsize not 32 or 64
 #endif
 
 /*
- * The above primes are actively bad for hashing, since they are
- * too sparse. The 32-bit one is mostly ok, the 64-bit one causes
- * real problems. Besides, the "prime" part is pointless for the
- * multiplicative hash.
+ * This hash multiplies the input by a large odd number and takes the
+ * high bits.  Since multiplication propagates changes to the most
+ * significant end only, it is essential that the high bits of the
+ * product be used for the hash value.
+ *
+ * Chuck Lever verified the effectiveness of this technique:
+ * http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf
  *
  * Although a random odd number will do, it turns out that the golden
  * ratio phi = (sqrt(5)-1)/2, or its negative, has particularly nice
- * properties.
+ * properties.  (See Knuth vol 3, section 6.4, exercise 9.)
  *
- * These are the negative, (1 - phi) = (phi^2) = (3 - sqrt(5))/2.
- * (See Knuth vol 3, section 6.4, exercise 9.)
+ * These are the negative, (1 - phi) = phi**2 = (3 - sqrt(5))/2,
+ * which is very slightly easier to multiply by and makes no
+ * difference to the hash distribution.
  */
 #define GOLDEN_RATIO_32 0x61C88647
 #define GOLDEN_RATIO_64 0x61C8864680B583EBull
 
+static inline u32 __hash_32(u32 val)
+{
+   return val * GOLDEN_RATIO_32;
+}
+
+static inline u32 hash_32(u32 val, unsigned int bits)
+{
+   /* High bits are more random, so use them. */
+   return __hash_32(val) >> (32 - bits);
+}
+
 static __always_inline u32 hash_64(u64 val, unsigned int bits)
 {
-   u64 hash = val;
-
-#if BITS_PER_LONG == 64
-   hash = hash * GOLDEN_RATIO_64;
-#else
-   /*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
-   u64 n = hash;
-   n <<= 18;
-   hash -= n;
-   n <<= 33;
-   hash -= n;
-   n <<= 3;
-   hash += n;
-   n <<= 3;
-   hash -= n;
-   n <<= 4;
-   hash += n;
-   n <<= 2;
-   hash += n;
-#endif
-
if (__builtin_constant_p(bits > 32 || bits == 0)) {
BUILD_BUG_ON(bits > 32 || bits == 0);
} else {
WARN_ON(bits > 32 || bits == 0);
}
 
-   /* High bits are more random, so use them. */
-   return (unsigned)(hash >> (64 - bits));
-}
-
-static inline u32 hash_32(u32 val, unsigned int bits)
-{
-   /* On some cpus multiply is faster, on others gcc will do shifts */
-   u32 hash = val * GOLDEN_RATIO_PRIME_32;
-
-   /* High bits are more random, so use them. */
-   return hash >> (32 - bits);
+#if BITS_PER_LONG == 64
+   /* 64x64-bit multiply is efficient on all 64-bit processors */
+   return val * GOLDEN_RATIO_64 >> (64 - bits);
+#else
+   /*
+* Hash 64 bits using only 32x32-bit multiply.  GOLDEN_RATIO is
+* phi**2 = 1-phi = 0.38196601.  The square of that is phi**4 =
+* 0.14589803 = 1/6.85, which is starting to have the low bits of
+* (val >> 32) not affect the high bits of the hash.  By subtracting,
+* we end up with phi**3 = 0.23606798, which is a bit better.
+*/
+   return hash_32((u32)val - __hash_32(val >> 32), bits);
+#endif
 }
 
 static

[PATCH 06/10] fs/namei.c: Improve dcache hash function

2016-05-25 Thread George Spelvin

Patch 0fed3ac866 improved the hash mixing, but the function is slower
than necessary; there's a 7-instruction dependency chain (10 on x86)
each loop iteration.

Word-at-a-time access is a very tight loop (which is good, because
link_path_walk() is one of the hottest code paths in the entire kernel),
and the hash mixing function must not have a longer latency to avoid
slowing it down.

There do not appear to be any published fast hash functions that:
1) Operate on the input a word at a time, and
2) Don't need to know the length of the input beforehand, and
3) Have a single iterated mixing function, not needing conditional
   branches or unrolling to distinguish different loop iterations.

One of the algorithms which comes closest is Yann Collet's xxHash, but
that's two dependent multiplies per word, which is too much.

The key insights in this design are:

1) Except for multiplies, to diffuse one input bit across 64 bits of hash
   state takes at least log2(64) = 6 sequentially dependent instructions.
   That is more cycles than we'd like.
2) An operation like "hash ^= hash << 13" requires a second temporary
   register anyway, and on a 2-operand machine like x86, it's three
   instructions.
3) A better use of a second register is to hold a two-word hash state.
   With careful design, no temporaries are needed at all, so it doesn't
   increase register pressure.  And this gets rid of register copying
   on 2-operand machines, so the code is smaller and faster.
4) Using two words of state weakens the requirement for one-round mixing;
   we now have two rounds of mixing before cancellation is possible.
5) A two-word hash state also allows operations on both halves to be
   done in parallel, so on a superscalar processor we get more mixing
   in fewer cycles.

I ended up using a mixing function inspired by the ChaCha and Speck
round functions.  It is 6 simple instructions and 3 cycles per iteration
(assuming multiply by 9 can be done by an "lea" instruction):

x ^= *input++;
y ^= x; x = ROL(x, K1);
x += y; y = ROL(y, K2);
y *= 9;

Not only is this reversible, two consecutive rounds are reversible:
if you are given the initial and final states, but not the intermediate
state, it is possible to compute both input words.  This means that at
least 3 words of input are required to create a collision.

(It also has the property, used by hash_name() to avoid a branch, that
it hashes all-zero to all-zero.)
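
The round above and its inverse can be sketched in portable C as follows. This is an illustration under stated assumptions: the rotate amounts K1 = 12 and K2 = 45 are placeholders (the text shown here does not give the chosen values), and struct mix_state, mix_round() and unmix_round() are hypothetical names. INV9 is the multiplicative inverse of 9 modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

#define K1 12				/* placeholder rotate amount (assumption) */
#define K2 45				/* placeholder rotate amount (assumption) */
#define INV9 0x8E38E38E38E38E39ull	/* 9 * INV9 == 1 (mod 2^64) */

struct mix_state {
	uint64_t x, y;
};

static inline uint64_t rol64(uint64_t v, unsigned int s)
{
	assert(s > 0 && s < 64);
	return (v << s) | (v >> (64 - s));
}

static inline uint64_t ror64(uint64_t v, unsigned int s)
{
	assert(s > 0 && s < 64);
	return (v >> s) | (v << (64 - s));
}

/* One round: absorb an input word, then mix the two state words. */
static void mix_round(struct mix_state *s, uint64_t input)
{
	s->x ^= input;
	s->y ^= s->x;	s->x = rol64(s->x, K1);
	s->x += s->y;	s->y = rol64(s->y, K2);
	s->y *= 9;
}

/* Undo one round, given the same input word: each step is inverted in reverse order. */
static void unmix_round(struct mix_state *s, uint64_t input)
{
	s->y *= INV9;
	s->y = ror64(s->y, K2);
	s->x -= s->y;
	s->x = ror64(s->x, K1);
	s->y ^= s->x;
	s->x ^= input;
}
```

Every step is a bijection on the 128-bit state (xor, rotate, add, multiply by an odd constant), which is why a single round with a known input can be undone; the all-zero state with an all-zero input stays all-zero because none of the steps adds a constant.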

The rotate constants K1 and K2 were found by experiment.  The search took
a sample of random initial states (I used 1023) and considered the effect
of flipping each of the 64 input bits on each of the 128 output bits two
rounds later.  Each of the 8192 pairs can be considered a biased coin, and
adding up the Shannon entropy of all of them produces a score.

The best-scoring shifts also did well in other tests (flipping bits in y,
trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
so the choice was made with the additional constraint that the sum of the
shifts is odd and not too close to the word size.

The final state is then folded into a 32-bit hash value by a less carefully
optimized multiply-based scheme.  This also has to be fast, as pathname
components tend to be short (the most common case is one iteration!), but
there's some room for latency, as there is a fair bit of intervening logic
before the hash value is used for anything.

(Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs.  I need
a better benchmark; the numbers seem to show a slight dip in performance
between 4.6.0 and this patch, but they're too noisy to quote.)

Signed-off-by: George Spelvin 
---
 fs/namei.c | 110 -
 1 file changed, 73 insertions(+), 37 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ce640d65..2b8d0650 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "internal.h"
@@ -1788,36 +1789,75 @@ static int walk_component(struct nameidata *nd, int flags)
 #include 
 
 #ifdef CONFIG_64BIT
-
-static inline unsigned int fold_hash(unsigned long hash)
-{
-   return hash_64(hash, 32);
-}
+/*
+ * Register pressure in the mixing function is an issue, particularly
+ * on 32-bit x86, but almost any function requires one state value and
+ * one temporary.  Instead, use a function designed for two state values
+ * and no temporaries.
+ *
+ * This function cannot create a collision in only two iterations, so
+ * we have two iterations to achieve avalanche.  In those two iterations,
+ * we have six layers of mixing, which is enough to spread one bit's
+ * influence out to 2^6 = 64 state bits.
+ *
+ * Rotate constants are scored by considering either 64 one-bit input
+ * deltas or 64*63/2 = 2016 two-bit input deltas, and finding the
+ * probability of that delta causing a change to each of the 128 output
+ * bits, using a sample of

[PATCH 07/10] : Add support for architecture-specific functions

2016-05-25 Thread George Spelvin

This is just the infrastructure; there are no users yet.

This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of .

That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
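
The override mechanism can be sketched in a single file like this. It is an illustration only: the names raw_hash_32, sketch_hash_32 and the HAVE_ARCH_RAW_HASH_32 / HAVE_ARCH_SKETCH_HASH_32 guards are hypothetical, and the "arch" section stands in for a separate per-architecture header.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* ---- What an architecture header might provide (hypothetical) ---- */
#define HAVE_ARCH_RAW_HASH_32 1
static inline uint32_t raw_hash_32(uint32_t x)
{
	/* Same product as x * GOLDEN_RATIO_32, built from two 16-bit-constant multiplies. */
	return x * (GOLDEN_RATIO_32 & 0xffffu) +
	       ((x * (GOLDEN_RATIO_32 >> 16)) << 16);
}

/* ---- Generic code: each fallback is compiled only if not overridden ---- */
#ifndef HAVE_ARCH_RAW_HASH_32
static inline uint32_t raw_hash_32(uint32_t x)
{
	return x * GOLDEN_RATIO_32;
}
#endif

#ifndef HAVE_ARCH_SKETCH_HASH_32
static inline uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	/* Picks up whichever raw_hash_32() was defined above. */
	return raw_hash_32(val) >> (32 - bits);
}
#endif
```

Because the guards are per function, an architecture can replace only the low-level multiply and still inherit the generic wrappers built on top of it.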

Signed-off-by: George Spelvin 
Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Cc: linux-m...@lists.linux-m68k.org
Cc: Alistair Francis 
Cc: Michal Simek 
Cc: Yoshinori Sato 
Cc: uclinux-h8-de...@lists.sourceforge.jp
---
 arch/Kconfig |  8 
 fs/namei.c   |  6 +-
 include/linux/hash.h | 11 +++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 81869a5e..33e8d7b1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -589,6 +589,14 @@ config HAVE_STACK_VALIDATION
  Architecture supports the 'objtool check' host tool command, which
  performs compile-time stack metadata validation.
 
+config HAVE_ARCH_HASH
+   bool
+   default n
+   help
+ If this is set, the architecture provides an 
+ file which provides platform-specific implementations of some
+ functions in  or fs/namei.c.
+
 #
 # ABI hall of shame
 #
diff --git a/fs/namei.c b/fs/namei.c
index 2b8d0650..380e8057 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1788,7 +1788,11 @@ static int walk_component(struct nameidata *nd, int flags)
 
 #include 
 
-#ifdef CONFIG_64BIT
+#ifdef HASH_MIX
+
+/* Architecture provides HASH_MIX and fold_hash() in  */
+
+#elif defined(CONFIG_64BIT)
 /*
  * Register pressure in the mixing function is an issue, particularly
  * on 32-bit x86, but almost any function requires one state value and
diff --git a/include/linux/hash.h b/include/linux/hash.h
index 8926f369..838bc84b 100644
--- a/include/linux/hash.h
+++ b/include/linux/hash.h
@@ -41,17 +41,27 @@
 #define GOLDEN_RATIO_32 0x61C88647
 #define GOLDEN_RATIO_64 0x61C8864680B583EBull
 
+#ifdef CONFIG_HAVE_ARCH_HASH
+/* This header may use the GOLDEN_RATIO_xx constants */
+#include 
+#endif
+
+#ifndef HAVE_ARCH__HASH_32
 static inline u32 __hash_32(u32 val)
 {
return val * GOLDEN_RATIO_32;
 }
+#endif
 
+#ifndef HAVE_ARCH_HASH_32
 static inline u32 hash_32(u32 val, unsigned int bits)
 {
/* High bits are more random, so use them. */
return __hash_32(val) >> (32 - bits);
 }
+#endif
 
+#ifndef HAVE_ARCH_HASH_64
 static __always_inline u32 hash_64(u64 val, unsigned int bits)
 {
if (__builtin_constant_p(bits > 32 || bits == 0)) {
@@ -74,6 +84,7 @@ static __always_inline u32 hash_64(u64 val, unsigned int bits)
return hash_32((u32)val - __hash_32(val >> 32), bits);
 #endif
 }
+#endif
 
 static inline u32 hash_ptr(const void *ptr, unsigned int bits)
 {
-- 
2.8.1

[PATCH 08/10] m68k: Add

2016-05-25 Thread George Spelvin

This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive. :-)

Addition chains found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html
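
For readers who do not speak m68k assembly, the same decomposition can be written in portable C roughly as follows. This is a sketch for checking the arithmetic, not a replacement for the asm; mul_golden_16bit() and self_check() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* x * 0x61C88647 using shifts/adds for the low half (0x8647) and one 16-bit multiply for 0x61C8. */
static inline uint32_t mul_golden_16bit(uint32_t x)
{
	uint32_t a, b;

	a = x << 2;	/* a = 0x0004 * x */
	b = a;
	a <<= 7;	/* a = 0x0200 * x */
	a += x;		/* a = 0x0201 * x */
	b += a;		/* b = 0x0205 * x */
	a += a;		/* a = 0x0402 * x */
	b += a;		/* b = 0x0607 * x */
	a <<= 5;	/* a = 0x8040 * x, so a + b = 0x8647 * x */

	/* Only the low 16 bits of x * 0x61C8 survive the shift into the high half. */
	return ((uint32_t)(uint16_t)(x * 0x61C8u) << 16) + a + b;
}

/* Compare against the plain 32-bit multiply for a few inputs. */
static void self_check(void)
{
	static const uint32_t samples[] = { 0, 1, 2, 0x8000u, 0x12345678u, 0xffffffffu };
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(mul_golden_16bit(samples[i]) == samples[i] * 0x61C88647u);
}
```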

Signed-off-by: George Spelvin 
Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Cc: linux-m...@lists.linux-m68k.org
---
 arch/m68k/Kconfig|  1 +
 arch/m68k/include/asm/archhash.h | 67 
 2 files changed, 68 insertions(+)
 create mode 100644 arch/m68k/include/asm/archhash.h

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 498b567f..95197d5e 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -23,6 +23,7 @@ config M68K
select MODULES_USE_ELF_RELA
select OLD_SIGSUSPEND3
select OLD_SIGACTION
+   select HAVE_ARCH_HASH
 
 config RWSEM_GENERIC_SPINLOCK
bool
diff --git a/arch/m68k/include/asm/archhash.h b/arch/m68k/include/asm/archhash.h
new file mode 100644
index ..c2bb2fc5
--- /dev/null
+++ b/arch/m68k/include/asm/archhash.h
@@ -0,0 +1,67 @@
+#ifndef _ASM_ARCHHASH_H
+#define _ASM_ARCHHASH_H
+
+/*
+ * The only 68k processors that lack MULU.L and so need this workaround
+ * are the original 68000 and 68010.
+ *
+ * Annoyingly, GCC defines __mc68000 for all processors in the family;
+ * the only way to identify an mc68000 is by the *absence* of other
+ * symbols; __mcpu32, __mcoldfire__, __mc68020, etc.
+ */
+#if ! (defined(__mc68020) || \
+   defined(__mc68030) || \
+   defined(__mc68040) || \
+   defined(__mc68060) || \
+   defined(__mcpu32)  || \
+   defined(__mcoldfire))
+
+#define HAVE_ARCH__HASH_32 1
+/*
+ * While it would be legal to substitute a different hash operation
+ * entirely, let's keep it simple and just use an optimized multiply
+ * by GOLDEN_RATIO_32 = 0x61C88647.
+ *
+ * The best way to do that appears to be to multiply by 0x8647 with
+ * shifts and adds, and use mulu.w to multiply the high half by 0x61C8.
+ *
+ * Because the 68000 has multi-cycle shifts, this addition chain is
+ * chosen to minimise the shift distances.
+ *
+ * Despite every attempt to spoon-feed GCC simple operations, GCC 6.1.1
+ * doggedly insists on doing annoying things like converting "lsl.l #2,"
+ * (12 cycles) to two adds (8+8 cycles).
+ *
+ * It also likes to notice two shifts in a row, like "a = x << 2" and
+ * "a <<= 7", and convert that to "a = x << 9".  But shifts longer than
+ * 8 bits are extra-slow on m68k, so that's a lose.
+ *
+ * Since the 68000 is a very simple in-order processor with no instruction
+ * scheduling effects on execution time, we can safely take it out of GCC's
+ * hands and write one big asm() block.
+ *
+ * Without calling overhead, this operation is 30 bytes (14 instructions
+ * plus one immediate constant) and 166 cycles.
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+   u32 a, b;
+
+   asm(   "move.l %2,%0"   /* 0x0001 */
+   "\n lsl.l #2,%0"/* 0x0004 */
+   "\n move.l %0,%1"
+   "\n lsl.l #7,%0"/* 0x0200 */
+   "\n add.l %2,%0"/* 0x0201 */
+   "\n add.l %0,%1"/* 0x0205 */
+   "\n add.l %0,%0"/* 0x0402 */
+   "\n add.l %0,%1"/* 0x0607 */
+   "\n lsl.l #5,%0"/* 0x8040 */
+   /* 0x8647 */
+   : "=&d" (a), "=&r" (b)
+   : "g" (x));
+
+   return ((u16)(x*0x61c8) << 16) + a + b;
+}
+#endif /* HAVE_ARCH__HASH_32 */
+
+#endif /* _ASM_ARCHHASH_H */
-- 
2.8.1

[PATCH 09/10] microblaze: Add

2016-05-25 Thread George Spelvin

Microblaze is an FPGA soft core that can be configured various ways.

If it is configured without a multiplier, the standard __hash_32()
will require a call to __mulsi3, which is a slow software loop.

Instead, use a shift-and-add sequence for the constant multiply.
GCC knows how to do this, but it's not as clever as some.
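
The 6-shift, 6-add sequence quoted in the header comment of this patch can be checked against a plain multiply with a small userspace sketch like the one below; mul_golden_shift_add() and self_check() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* x * 0x61C88647 using only shifts, adds and one subtract. */
static inline uint32_t mul_golden_shift_add(uint32_t x)
{
	uint32_t a, b, c;

	c = (x << 19) + x;	/* c = 0x00080001 * x */
	a = (x <<  9) + c;	/* a = 0x00080201 * x */
	b = (x << 23) + a;	/* b = 0x00880201 * x */

	/* 0x40100800 + 0x22008040 + 0x00400008 - 0x00880201 = 0x61C88647 */
	return (a << 11) + (b << 6) + (c << 3) - b;
}

/* Compare against the plain 32-bit multiply for a few inputs. */
static void self_check(void)
{
	static const uint32_t samples[] = { 0, 1, 3, 0x10000u, 0xdeadbeefu, 0xffffffffu };
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(mul_golden_shift_add(samples[i]) == samples[i] * 0x61C88647u);
}
```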

Signed-off-by: George Spelvin 
Cc: Alistair Francis 
Cc: Michal Simek 
---
 arch/microblaze/Kconfig|  1 +
 arch/microblaze/include/asm/archhash.h | 80 ++
 2 files changed, 81 insertions(+)
 create mode 100644 arch/microblaze/include/asm/archhash.h

diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 3d793b55..ce3e5125 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -16,6 +16,7 @@ config MICROBLAZE
select GENERIC_IRQ_SHOW
select GENERIC_PCI_IOMAP
select GENERIC_SCHED_CLOCK
+   select HAVE_ARCH_HASH
select HAVE_ARCH_KGDB
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_API_DEBUG
diff --git a/arch/microblaze/include/asm/archhash.h b/arch/microblaze/include/asm/archhash.h
new file mode 100644
index ..d63cd19e
--- /dev/null
+++ b/arch/microblaze/include/asm/archhash.h
@@ -0,0 +1,80 @@
+#ifndef _ASM_ARCHHASH_H
+#define _ASM_ARCHHASH_H
+
+/*
+ * Fortunately, most people who want to run Linux on Microblaze enable
+ * both multiplier and barrel shifter, but omitting them is technically
+ * a supported configuration.
+ *
+ * With just a barrel shifter, we can implement an efficient constant
+ * multiply using shifts and adds.  GCC can find a 9-step solution, but
+ * this 6-step solution was found by Yevgen Voronenko's implementation
+ * of the Hcub algorithm at http://spiral.ece.cmu.edu/mcm/gen.html.
+ *
+ * That software is really not designed for a single multiplier this large,
+ * but if you run it enough times with different seeds, it'll find several
+ * 6-shift, 6-add sequences for computing x * 0x61C88647.  They are all
+ * c = (x << 19) + x;
+ * a = (x <<  9) + c;
+ * b = (x << 23) + a;
+ * return (a<<11) + (b<<6) + (c<<3) - b;
+ * with variations on the order of the final add.
+ *
+ * Without even a shifter, it's hopeless; any hash function will suck.
+ */
+
+#if CONFIG_XILINX_MICROBLAZE0_USE_HW_MUL == 0
+
+#define HAVE_ARCH__HASH_32 1
+
+/* Multiply by GOLDEN_RATIO_32 = 0x61C88647 */
+static inline u32 __attribute_const__ __hash_32(u32 a)
+{
+#if CONFIG_XILINX_MICROBLAZE0_USE_BARREL
+   unsigned b, c;
+
+   /* Phase 1: Compute three intermediate values */
+   b =  a << 23;
+   c = (a << 19) + a;
+   a = (a <<  9) + c;
+   b += a;
+
+   /* Phase 2: Compute (a << 11) + (b << 6) + (c << 3) - b */
+   a <<= 5;
+   a += b; /* (a << 5) + b */
+   a <<= 3;
+   a += c; /* (a << 8) + (b << 3) + c */
+   a <<= 3;
+   return a - b;   /* (a << 11) + (b << 6) + (c << 3) - b */
+#else
+   /*
+    * "This is really going to hurt."
+    *
+    * Without a barrel shifter, left shifts are implemented as
+    * repeated additions, and the best we can do is an optimal
+    * addition-subtraction chain.  This one is not known to be
+    * optimal, but at 37 steps, it's decent for a 31-bit multiplier.
+    *
+    * Question: given its size (37*4 = 148 bytes per instance),
+    * and slowness, is this worth having inline?
+    */
+   unsigned b, c, d;
+   b = a << 4;     /* 4    */
+   c = b << 1;     /* 1  5 */
+   b += a;         /* 1  6 */
+   c += b;         /* 1  7 */
+   c <<= 3;        /* 3 10 */
+   c -= a;         /* 1 11 */
+   d = c << 7;     /* 7 18 */
+   d += b;         /* 1 19 */
+   d <<= 8;        /* 8 27 */
+   d += a;         /* 1 28 */
+   d <<= 1;        /* 1 29 */
+   d += b;         /* 1 30 */
+   d <<= 6;        /* 6 36 */
+   return d + c;   /* 1 37 total instructions */
+#endif
+}
+
+#endif /* !CONFIG_XILINX_MICROBLAZE0_USE_HW_MUL */
+#endif /* _ASM_ARCHHASH_H */
-- 
2.8.1
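
For anyone who wants to check the arithmetic in the patch above, here is a
minimal userspace sketch (not part of the patch). It copies the two
shift-and-add sequences and compares them with a plain 32-bit multiply by
0x61C88647. The function names and the LCG used to generate sample inputs
are assumptions made for this illustration only.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* Barrel-shifter variant from the patch comment: 6 shifts, 6 adds/subtracts. */
static uint32_t hash32_shift_add(uint32_t x)
{
	uint32_t c = (x << 19) + x;
	uint32_t a = (x << 9) + c;
	uint32_t b = (x << 23) + a;

	return (a << 11) + (b << 6) + (c << 3) - b;
}

/* No-shifter variant: the same addition-subtraction chain as in the patch. */
static uint32_t hash32_add_chain(uint32_t a)
{
	uint32_t b, c, d;

	b = a << 4;
	c = b << 1;
	b += a;
	c += b;
	c <<= 3;
	c -= a;
	d = c << 7;
	d += b;
	d <<= 8;
	d += a;
	d <<= 1;
	d += b;
	d <<= 6;
	return d + c;
}

/* Returns 1 if both variants match x * GOLDEN_RATIO_32 on a sample of inputs. */
static int hash32_variants_agree(void)
{
	uint32_t x = 1;
	int i;

	for (i = 0; i < 100000; i++) {
		uint32_t want = x * GOLDEN_RATIO_32;

		if (hash32_shift_add(x) != want || hash32_add_chain(x) != want)
			return 0;
		x = x * 1664525u + 1013904223u;	/* LCG step, only to vary inputs */
	}
	return 1;
}
```

Calling hash32_variants_agree() from a small test program is enough to
confirm that both chains are congruent to the multiply modulo 2^32.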

[PATCH 10/10] h8300: Add

2016-05-25 Thread George Spelvin

This will improve the performance of hash_32() and hash_64(), but due
to complete lack of multi-bit shift instructions on H8, performance will
still be bad in surrounding code.

Designing H8-specific hash algorithms to work around that is a separate
project.  (But if the maintainers would like to get in touch...)

Signed-off-by: George Spelvin 
Cc: Yoshinori Sato 
Cc: uclinux-h8-de...@lists.sourceforge.jp
---
 arch/h8300/Kconfig                |  1 +
 arch/h8300/include/asm/archhash.h | 52 +++
 2 files changed, 53 insertions(+)
 create mode 100644 arch/h8300/include/asm/archhash.h

diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 986ea84c..6c583dbb 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -20,6 +20,7 @@ config H8300
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZO
select HAVE_ARCH_KGDB
+   select HAVE_ARCH_HASH
 
 config RWSEM_GENERIC_SPINLOCK
def_bool y
diff --git a/arch/h8300/include/asm/archhash.h b/arch/h8300/include/asm/archhash.h
new file mode 100644
index ..018ed96a
--- /dev/null
+++ b/arch/h8300/include/asm/archhash.h
@@ -0,0 +1,52 @@
+#ifndef _ASM_ARCHHASH_H
+#define _ASM_ARCHHASH_H
+
+/*
+ * The later H8SX models have a 32x32-bit multiply, but the H8/300H
+ * and H8S have only 16x16->32.  Since it's tolerably compact, this
+ * is basically an inlined version of the __mulsi3 code.  It's also
+ * simplified by skipping the early-out checks.
+ *
+ * (Since neither CPU has any multi-bit shift instructions, a
+ * shift-and-add version is a non-starter.)
+ *
+ * TODO: come up with an arch-specific version of the hashing in fs/namei.c,
+ * since that is heavily dependent on rotates.  Which, as mentioned, suck
+ * horribly on H8.
+ */
+
+#if defined(CONFIG_CPU_H300H) || defined(CONFIG_CPU_H8S)
+
+#define HAVE_ARCH__HASH_32 1
+
+/*
+ * Multiply by k = 0x61C88647.  Fitting this into three registers requires
+ * one extra instruction, but reducing register pressure will probably
+ * make that back and then some.
+ *
+ * GCC asm note: %e1 is the high half of operand %1, while %f1 is the
+ * low half.  So if %1 is er4, then %e1 is e4 and %f1 is r4.
+ *
+ * This has been designed to modify x in place, since that's the most
+ * common usage, but preserve k, since hash_64() makes two calls
+ * in quick succession.
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+   u32 temp;
+
+   asm(   "mov.w   %e1,%f0"
+   "\n mulxu.w %f2,%0" /* klow * xhigh */
+   "\n mov.w   %f0,%e1"/* The extra instruction */
+   "\n mov.w   %f1,%f0"
+   "\n mulxu.w %e2,%0" /* khigh * xlow */
+   "\n add.w   %e1,%f0"
+   "\n mulxu.w %f2,%1" /* klow * xlow */
+   "\n add.w   %f0,%e1"
+   : "=&r" (temp), "=r" (x)
+   : "%r" (GOLDEN_RATIO_32), "1" (x));
+   return x;
+}
+
+#endif /* CONFIG_CPU_H300H || CONFIG_CPU_H8S */
+#endif /* _ASM_ARCHHASH_H */
-- 
2.8.1
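
As a rough illustration of what the inline asm above computes, here is a
portable C sketch (not part of the patch) of a 32x32->32 multiply built
from 16x16->32 partial products. The function name is hypothetical. The
khigh * xhigh product is omitted because it only affects bits 32 and up.

```c
#include <stdint.h>

/* Low 32 bits of x * k, using only 16x16->32 multiplies. */
static uint32_t mul32_via_16x16(uint32_t x, uint32_t k)
{
	uint16_t xlow = (uint16_t)x, xhigh = (uint16_t)(x >> 16);
	uint16_t klow = (uint16_t)k, khigh = (uint16_t)(k >> 16);

	/* Cross terms: only their low 16 bits survive the shift by 16. */
	uint32_t cross = (uint32_t)klow * xhigh + (uint32_t)khigh * xlow;
	/* Low term: klow * xlow contributes all 32 of its bits. */
	uint32_t low = (uint32_t)klow * xlow;

	return low + (cross << 16);
}
```

With k = 0x61C88647 this should give the same value as the generic
__hash_32() multiply, which is what the three mulxu.w instructions are
arranged to produce.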

[RESEND PATCH] ARM64: dts: rockchip: add thermal zone node for rk3399 SoCs

2016-05-25 Thread Caesar Wang

This adds thermal zone nodes to the rk3399 dtsi. The rk3399 thermal data
includes the CPU and GPU sensor zone nodes.

The thermal zone node is the node containing all the required info
for describing a thermal zone, including its cooling device bindings.
The thermal zone node must contain, apart from its own properties, one
sub-node containing trip nodes and one sub-node containing all the zone
cooling maps.

The following parameters are introduced:
* polling-delay:
The maximum number of milliseconds to wait between polls

* polling-delay-passive:
The maximum number of milliseconds to wait between polls when performing
passive cooling.

* trips:
A sub-node which is a container of only trip point nodes required to
describe the thermal zone.

* cooling-maps:
A sub-node which is a container of only cooling device map nodes, used to
describe the relation between trips and cooling devices.

* cooling-device:
A phandle of a cooling device with its specifier, referring to which
cooling device is used in this cooling specifier binding. In the cooling
specifier, the first cell is the minimum cooling state and the second cell
is the maximum cooling state used in this map.

Signed-off-by: Caesar Wang 
---

 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 100 +++
 1 file changed, 100 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 46f325a..f8a80c2 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
compatible = "rockchip,rk3399";
@@ -389,6 +390,95 @@
status = "disabled";
};
 
+	thermal-zones {
+		cpu_thermal: cpu {
+			polling-delay-passive = <100>; /* milliseconds */
+			polling-delay = <1000>; /* milliseconds */
+
+			thermal-sensors = <&tsadc 0>;
+
+			trips {
+				cpu_alert0: cpu_alert0 {
+					temperature = <7>; /* millicelsius */
+					hysteresis = <2000>; /* millicelsius */
+					type = "passive";
+				};
+				cpu_alert1: cpu_alert1 {
+					temperature = <75000>; /* millicelsius */
+					hysteresis = <2000>; /* millicelsius */
+					type = "passive";
+				};
+				cpu_crit: cpu_crit {
+					temperature = <95000>; /* millicelsius */
+					hysteresis = <2000>; /* millicelsius */
+					type = "critical";
+				};
+			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert0>;
+					cooling-device =
+						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+				map1 {
+					trip = <&cpu_alert1>;
+					cooling-device =
+						<&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
+		};
+
+		gpu_thermal: gpu {
+			polling-delay-passive = <100>; /* milliseconds */
+			polling-delay = <1000>; /* milliseconds */
+
+			thermal-sensors = <&tsadc 1>;
+
+			trips {
+				gpu_alert0: gpu_alert0 {
+					temperature = <75000>; /* millicelsius */
+					hysteresis = <2000>; /* millicelsius */
+					type = "passive";
+				};
+				gpu_crit: gpu_crit {
+					temperature = <95000>; /* millicelsius */
+					hysteresis = <2000>; /* millicelsius */
+					type = "critical";
+				};
+			};
+
+			cooling-maps {
+				map0 {
+					trip = <&gpu_alert0>;
+					cooling-device =
+						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+

[PATCH v4] input: tablet: add Pegasus Notetaker tablet driver

2016-05-25 Thread Martin Kepplinger

This adds a driver for the Pegasus Notetaker Pen. When connected,
this uses the Pen as an input tablet.

This device was sold in various different brandings, for example
"Pegasus Mobile Notetaker M210",
"Genie e-note The Notetaker",
"Staedtler Digital ballpoint pen 990 01",
"IRISnotes Express" or
"NEWLink Digital Note Taker".

Here's an example, so that you know what we are talking about:
http://www.staedtler.com/en/products/ink-writing-instruments/ballpoint-pens/digital-pen-990-01-digital-ballpoint-pen

http://pegatech.blogspot.com/ seems to be a remaining official resource.

This device can also transfer saved (offline recorded handwritten) data and
there are userspace programs that do this, see https://launchpad.net/m210
(Well, alternatively there are really fast scanners out there :)

It's *really* fun to use as an input tablet though! So let's support this
for everybody.

There's no way to disable the device. When the pen is out of range, we just
don't get any URBs and don't do anything.
Like all other mice or input tablets, we don't use runtime PM.

Signed-off-by: Martin Kepplinger 
---

Thanks for having a look. Any more suggestions on this?

revision history

v4 use normal work queue instead of a kernel thread (thanks to Oliver Neukum)
v3 fix reporting low pen battery and add USB list to CC
v2 minor cleanup (remove unnecessary variables)
v1 initial release



 drivers/input/tablet/Kconfig             |  15 ++
 drivers/input/tablet/Makefile            |   1 +
 drivers/input/tablet/pegasus_notetaker.c | 373 +++
 3 files changed, 389 insertions(+)
 create mode 100644 drivers/input/tablet/pegasus_notetaker.c

diff --git a/drivers/input/tablet/Kconfig b/drivers/input/tablet/Kconfig
index 623bb9e..a2b9f97 100644
--- a/drivers/input/tablet/Kconfig
+++ b/drivers/input/tablet/Kconfig
@@ -73,6 +73,21 @@ config TABLET_USB_KBTAB
  To compile this driver as a module, choose M here: the
  module will be called kbtab.
 
+config TABLET_USB_PEGASUS
+   tristate "Pegasus Mobile Notetaker Pen input tablet support"
+   depends on USB_ARCH_HAS_HCD
+   select USB
+   help
+ Say Y here if you want to use the Pegasus Mobile Notetaker,
+ also known as:
+ Genie e-note The Notetaker,
+ Staedtler Digital ballpoint pen 990 01,
+ IRISnotes Express or
+ NEWLink Digital Note Taker.
+
+ To compile this driver as a module, choose M here: the
+ module will be called pegasus_notetaker.
+
 config TABLET_SERIAL_WACOM4
tristate "Wacom protocol 4 serial tablet support"
select SERIO
diff --git a/drivers/input/tablet/Makefile b/drivers/input/tablet/Makefile
index 2e13010..200fc4e 100644
--- a/drivers/input/tablet/Makefile
+++ b/drivers/input/tablet/Makefile
@@ -8,4 +8,5 @@ obj-$(CONFIG_TABLET_USB_AIPTEK) += aiptek.o
 obj-$(CONFIG_TABLET_USB_GTCO)  += gtco.o
 obj-$(CONFIG_TABLET_USB_HANWANG) += hanwang.o
 obj-$(CONFIG_TABLET_USB_KBTAB) += kbtab.o
+obj-$(CONFIG_TABLET_USB_PEGASUS) += pegasus_notetaker.o
 obj-$(CONFIG_TABLET_SERIAL_WACOM4) += wacom_serial4.o
diff --git a/drivers/input/tablet/pegasus_notetaker.c b/drivers/input/tablet/pegasus_notetaker.c
new file mode 100644
index 000..b7bf585
--- /dev/null
+++ b/drivers/input/tablet/pegasus_notetaker.c
@@ -0,0 +1,373 @@
+/*
+ * Pegasus Mobile Notetaker Pen input tablet driver
+ *
+ * Copyright (c) 2016 Martin Kepplinger 
+ */
+
+/*
+ * request packet (control endpoint):
+ * |-------------------------------------|
+ * | Report ID | Nr of bytes | command   |
+ * | (1 byte)  | (1 byte)    | (n bytes) |
+ * |-------------------------------------|
+ * | 0x02      | n           |           |
+ * |-------------------------------------|
+ *
+ * data packet after set xy mode command, 0x80 0xb5 0x02 0x01
+ * and pen is in range:
+ *
+ * byte    byte name       value (bits)
+ * ------------------------------------
+ * 0       status          0 1 0 0 0 0 X X
+ * 1       color           0 0 0 0 H 0 S T
+ * 2       X low
+ * 3       X high
+ * 4       Y low
+ * 5       Y high
+ *
+ * X X     battery state:
+ *         no state reported       0x00
+ *         battery low             0x01
+ *         battery good            0x02
+ *
+ * H       Hovering
+ * S       Switch 1 (pen button)
+ * T       Tip
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+/* USB HID defines */
+#define USB_REQ_GET_REPORT 0x01
+#define USB_REQ_SET_REPORT 0x09
+
+#define USB_VENDOR_ID_PEGASUSTECH  0x0e20
+#define USB_DEVICE_ID_PEGASUS_NOTETAKER_EN100  0x0101
+
+/* device specific defines */
+#define NOTETAKER_REPORT_ID    0x02
+#define NOTETAKER_SET_CMD      0x80
+#define NOTETAKER_SET_MODE     0xb5
+
+#define NOTETAKER_LED_MOUSE    0x02
+#define PEN_MODE_XY            0x01
+
+#define SPECIAL_COMMAND        0x80
+#define BUTTON_PRESSED         0xb5
+#
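
The data packet layout documented in the header comment above can be
decoded roughly as follows. This is a userspace sketch and not the
driver's code: the struct and function names are hypothetical, and the
coordinates are kept as raw 16-bit little-endian values because the
comment does not say whether they are signed.

```c
#include <stdint.h>

/* Hypothetical decoded form of one 6-byte data packet. */
struct pen_report {
	unsigned battery;	/* 0x00 no state, 0x01 low, 0x02 good */
	int hovering;		/* H bit */
	int button;		/* S bit, switch 1 (pen button) */
	int tip;		/* T bit */
	uint16_t x, y;		/* raw little-endian coordinates */
};

/* Returns 0 on success, -1 if the status byte does not look like 0100 00XX. */
static int parse_pen_report(const uint8_t data[6], struct pen_report *out)
{
	if ((data[0] & 0xFC) != 0x40)
		return -1;

	out->battery  = data[0] & 0x03;		/* bits 1..0: X X */
	out->hovering = (data[1] >> 3) & 1;	/* bit 3: H */
	out->button   = (data[1] >> 1) & 1;	/* bit 1: S */
	out->tip      = data[1] & 1;		/* bit 0: T */
	out->x = (uint16_t)(data[2] | (data[3] << 8));
	out->y = (uint16_t)(data[4] | (data[5] << 8));
	return 0;
}
```

A real driver would additionally map these fields onto input events; that
part is not shown here.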

Re: [PATCH 2/2] ARM: dts: Add async-bridge clock to MFC power domain for Exynos5420

2016-05-25 Thread Krzysztof Kozlowski

On 05/24/2016 07:41 PM, Javier Martinez Canillas wrote:
> The MFC IP is also inter-connected by an Async-Bridge so the CLK_ACLK333
> has to be ungated during a power domain switch. Trying to do it when the
> clock is gated will fail and lead to an imprecise external abort error
> when the driver tries to access the MFC registers with the PD disabled.
> 
> For example, if the s5p-mfc module is removed and the MFC PD turned off:
> 
> [  186.835606] Power domain power-domain@10044060 disable failed
> [  186.835671] s5p-mfc 1100.codec: Removing 1100.codec
> [  186.837670] Power domain power-domain@10044060 disable failed
> 
> And when the module is inserted again:
> 
> [ 2395.176956] s5p_mfc_wait_for_done_dev:34: Interrupt (dev->int_type:0, command:12) timed out
> [ 2395.177031] s5p_mfc_init_hw:272: Failed to load firmware
> [ 2395.177384] Unhandled fault: imprecise external abort (0x1406) at 0x
> [ 2395.177441] pgd = ec3b4000
> [ 2395.177467] [] *pgd=
> [ 2395.177507] Internal error: : 1406 [#1] PREEMPT SMP ARM
> [ 2395.177550] Modules linked in: s5p_mfc mwifiex_sdio mwifiex uvcvideo s5p_jpeg v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_core v4l2_common videodev media [last unloaded: s5p_mfc]
> [ 2395.14] CPU: 1 PID: 2382 Comm: v4l_id Tainted: GW   4.6.0-rc6-next-20160502-00010-g7730dc64d2c1-dirty #179
> [ 2395.177857] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [ 2395.177906] task: ed275500 ti: e6c8c000 task.ti: e6c8c000
> [ 2395.177996] PC is at s5p_mfc_reset+0x1c4/0x284 [s5p_mfc]
> [ 2395.178057] LR is at s5p_mfc_reset+0x1a4/0x284 [s5p_mfc]
> 
> This patch fixes this issue by adding the CLK_ACLK333 as an Async-Bridge
> clock for the MFC power domain, so the PD configuration works properly.
> 
> Signed-off-by: Javier Martinez Canillas 
> 
> ---
> 
>  arch/arm/boot/dts/exynos5420.dtsi | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Indeed patch #1 is not a hard dependency here because there are no other
asb clocks. It is entirely obvious but works fine.

Reviewed-by: Krzysztof Kozlowski 

Unless all other patches are meant for the current fixes cycle (and/or
cc-stable), I do not plan to apply it now. I'll take it for v4.8, because:
1. Your previous patches are needed. Without them bind/unbind won't work.
2. This is not reproducible in a regular driver operation.
3. It needs clock change to actually be useful.

Is it okay?

Best regards,
Krzysztof

> 
> diff --git a/arch/arm/boot/dts/exynos5420.dtsi b/arch/arm/boot/dts/exynos5420.dtsi
> index 4c8523471c65..f3e9d873633e 100644
> --- a/arch/arm/boot/dts/exynos5420.dtsi
> +++ b/arch/arm/boot/dts/exynos5420.dtsi
> @@ -313,8 +313,9 @@
>   mfc_pd: power-domain@10044060 {
>   compatible = "samsung,exynos4210-pd";
>   reg = <0x10044060 0x20>;
> - clocks = <&clock CLK_FIN_PLL>, <&clock CLK_MOUT_USER_ACLK333>;
> - clock-names = "oscclk", "clk0";
> + clocks = <&clock CLK_FIN_PLL>, <&clock CLK_MOUT_USER_ACLK333>,
> +  <&clock CLK_ACLK333>;
> + clock-names = "oscclk", "clk0","asb0";
>   #power-domain-cells = <0>;
>   };
>  
>

Re: [PATCH 07/16] sched: Make SD_BALANCE_WAKE a topology flag

2016-05-25 Thread Yuyang Du

On Mon, May 23, 2016 at 11:58:49AM +0100, Morten Rasmussen wrote:
> For systems with the SD_ASYM_CPUCAPACITY flag set on higher level in the
> sched_domain hierarchy we need a way to enable wake-up balancing for the
> lower levels as well as we may want to balance tasks that don't fit the
> capacity of the previous cpu.
> 
> We have the option of introducing a new topology flag to express this
> requirement, or let the existing SD_BALANCE_WAKE flag be set by the
> architecture as a topology flag. The former means introducing yet
> another flag, the latter breaks the current meaning of topology flags.
> None of the options are really desirable.
 
I'd propose to replace SD_WAKE_AFFINE with SD_BALANCE_WAKE. And the
SD_WAKE_AFFINE semantic is simply "waker allowed":

waker_allowed = cpumask_test_cpu(cpu, tsk_cpus_allowed(p));

This can be implemented without current functionality change.

From there, the choice between waker and wakee, and fast path
select_idle_sibling() and the rest slow path should be reworked, which
I am thinking about.

Re: [PATCH v2 5/5] usb: dwc3: rockchip: add devicetree bindings documentation

2016-05-25 Thread William Wu


Hi Felipe,

On 05/24/2016 05:32 PM, Felipe Balbi wrote:

Hi,

William Wu  writes:

This patch documents the device tree documentation required for
Rockchip USB3.0 core wrapper consist of USB3.0 IP from Synopsys.

It could operate in device mode (SS, HS, FS) and host
mode (SS, HS, FS, LS).

Signed-off-by: William Wu 
---
Changes in v2:
- add rockchip,dwc3.txt to Documentation/devicetree/bindings/ (Felipe, Brian)

  .../devicetree/bindings/usb/rockchip,dwc3.txt  | 45 ++
  1 file changed, 45 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/usb/rockchip,dwc3.txt

diff --git a/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
new file mode 100644
index 000..10303d9
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
@@ -0,0 +1,45 @@
+Rockchip SuperSpeed DWC3 USB SoC controller
+
+Required properties:
+- compatible:  should contain "rockchip,dwc3"
+- clocks:  A list of phandle + clock-specifier pairs for the
+   clocks listed in clock-names
+- clock-names: Should contain the following:
+  "clk_usb3otg0_ref"          Controller reference clk
+  "clk_usb3otg0_suspend"      Controller suspend clk, can use 24 MHz or 32 KHz
+  "aclk_usb3"                 Master/Core clock, have to be >= 62.5 MHz for SS operation
+
+
+Optional clocks:
+  "aclk_usb3otg0"             Aclk for specific usb controller clock.
+  "aclk_usb3_rksoc_axi_perf"  USB AXI perf clock.  Not present on all platforms.
+  "aclk_usb3_grf"             USB grf clock.  Not present on all platforms.
+
+Required child node:
+A child node must exist to represent the core DWC3 IP block. The name of
+the node is not important. The content of the node is defined in dwc3.txt.
+
+Phy documentation is provided in the following places:
+
+Example device nodes:
+
+   usbdrd3_0: usb@fe80 {
+

no reg property?

For now, we don't need a reg property here, because we only need to
enable some clocks and populate the node's children in
drivers/usb/dwc3/dwc3-of-simple.c.

This is similar to the usbdrd3_0 node in arch/arm/boot/dts/exynos5420.dtsi.

compatible = "rockchip,dwc3";

+   clocks = <&cru SCLK_USB3OTG0_REF>, <&cru SCLK_USB3OTG0_SUSPEND>,
+<&cru ACLK_USB3>, <&cru ACLK_USB3OTG0>,
+<&cru ACLK_USB3_RKSOC_AXI_PERF>, <&cru ACLK_USB3_GRF>;
+   clock-names = "clk_usb3otg0_ref", "clk_usb3otg0_suspend",
+ "aclk_usb3", "aclk_usb3otg0",
+ "aclk_usb3_rksoc_axi_perf", "aclk_usb3_grf";
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+   status = "disabled";
+   usbdrd_dwc3_0: dwc3 {

no address here?
I don't think an address is necessarily needed here. The child node dwc3
can inherit the address from the parent node.

With this dtsi patch, the device path shows up as follows:
/sys/devices/platform/usb@fe80/fe80.dwc3

Is it needed for coding style, or for some other reason?




+   compatible = "snps,dwc3";
+   reg = <0x0 0xfe80 0x0 0x10>;
+   interrupts = ;
+   dr_mode = "otg";
+   status = "disabled";
+   };
+   };
--
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 2/2] ARM: EXYNOS: refactoring of mach-exynos to enable chipid driver

2016-05-25 Thread Pankaj Dubey

This patch enables chipid driver for ARCH_EXYNOS and refactors
machine code for using chipid driver for identification of
SoC ID and SoC rev.

Signed-off-by: Pankaj Dubey 
---
 arch/arm/mach-exynos/Kconfig                 |  1 +
 arch/arm/mach-exynos/common.h                | 53
 arch/arm/mach-exynos/exynos.c                | 49 +
 arch/arm/mach-exynos/include/mach/map.h      |  2 --
 arch/arm/mach-exynos/platsmp.c               |  2 +-
 arch/arm/mach-exynos/pm.c                    |  8 ++---
 arch/arm/plat-samsung/cpu.c                  | 14
 arch/arm/plat-samsung/include/plat/cpu.h     |  2 --
 arch/arm/plat-samsung/include/plat/map-s5p.h |  1 -
 9 files changed, 37 insertions(+), 95 deletions(-)

diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig
index 20dcf6e..f93c790 100644
--- a/arch/arm/mach-exynos/Kconfig
+++ b/arch/arm/mach-exynos/Kconfig
@@ -16,6 +16,7 @@ menuconfig ARCH_EXYNOS
select ARM_AMBA
select ARM_GIC
select COMMON_CLK_SAMSUNG
+   select EXYNOS_CHIPID
select EXYNOS_THERMAL
select EXYNOS_PMU
select EXYNOS_SROM
diff --git a/arch/arm/mach-exynos/common.h b/arch/arm/mach-exynos/common.h
index 5365bf1..566ad2b 100644
--- a/arch/arm/mach-exynos/common.h
+++ b/arch/arm/mach-exynos/common.h
@@ -13,39 +13,26 @@
 #define __ARCH_ARM_MACH_EXYNOS_COMMON_H
 
 #include 
+#include 
 
-#define EXYNOS3250_SOC_ID  0xE3472000
-#define EXYNOS3_SOC_MASK   0xF000
-
-#define EXYNOS4210_CPU_ID  0x4321
-#define EXYNOS4212_CPU_ID  0x4322
-#define EXYNOS4412_CPU_ID  0xE4412200
-#define EXYNOS4_CPU_MASK   0xFFFE
-
-#define EXYNOS5250_SOC_ID  0x4352
-#define EXYNOS5410_SOC_ID  0xE541
-#define EXYNOS5420_SOC_ID  0xE542
-#define EXYNOS5440_SOC_ID  0xE544
-#define EXYNOS5800_SOC_ID  0xE5422000
-#define EXYNOS5_SOC_MASK   0xF000
-
-extern unsigned long samsung_cpu_id;
+static inline u32 exynos_product_id(void);
 
 #define IS_SAMSUNG_CPU(name, id, mask) \
 static inline int is_samsung_##name(void)  \
 {  \
-   return ((samsung_cpu_id & mask) == (id & mask));\
+   u32 product_id = exynos_product_id();   \
+   return ((product_id & mask) == (id));   \
 }
 
-IS_SAMSUNG_CPU(exynos3250, EXYNOS3250_SOC_ID, EXYNOS3_SOC_MASK)
-IS_SAMSUNG_CPU(exynos4210, EXYNOS4210_CPU_ID, EXYNOS4_CPU_MASK)
-IS_SAMSUNG_CPU(exynos4212, EXYNOS4212_CPU_ID, EXYNOS4_CPU_MASK)
-IS_SAMSUNG_CPU(exynos4412, EXYNOS4412_CPU_ID, EXYNOS4_CPU_MASK)
-IS_SAMSUNG_CPU(exynos5250, EXYNOS5250_SOC_ID, EXYNOS5_SOC_MASK)
-IS_SAMSUNG_CPU(exynos5410, EXYNOS5410_SOC_ID, EXYNOS5_SOC_MASK)
-IS_SAMSUNG_CPU(exynos5420, EXYNOS5420_SOC_ID, EXYNOS5_SOC_MASK)
-IS_SAMSUNG_CPU(exynos5440, EXYNOS5440_SOC_ID, EXYNOS5_SOC_MASK)
-IS_SAMSUNG_CPU(exynos5800, EXYNOS5800_SOC_ID, EXYNOS5_SOC_MASK)
+IS_SAMSUNG_CPU(exynos3250, EXYNOS3250_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos4210, EXYNOS4210_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos4212, EXYNOS4212_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos4412, EXYNOS4412_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos5250, EXYNOS5250_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos5410, EXYNOS5410_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos5420, EXYNOS5420_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos5440, EXYNOS5440_SOC_ID, EXYNOS_SOC_MASK)
+IS_SAMSUNG_CPU(exynos5800, EXYNOS5800_SOC_ID, EXYNOS_SOC_MASK)
 
 #if defined(CONFIG_SOC_EXYNOS3250)
 # define soc_is_exynos3250()   is_samsung_exynos3250()
@@ -71,10 +58,6 @@ IS_SAMSUNG_CPU(exynos5800, EXYNOS5800_SOC_ID, EXYNOS5_SOC_MASK)
 # define soc_is_exynos4412()   0
 #endif
 
-#define EXYNOS4210_REV_0   (0x0)
-#define EXYNOS4210_REV_1_0 (0x10)
-#define EXYNOS4210_REV_1_1 (0x11)
-
 #if defined(CONFIG_SOC_EXYNOS5250)
 # define soc_is_exynos5250()   is_samsung_exynos5250()
 #else
@@ -172,6 +155,16 @@ extern void exynos_core_restart(u32 core_id);
 extern int exynos_set_boot_addr(u32 core_id, unsigned long boot_addr);
 extern int exynos_get_boot_addr(u32 core_id, unsigned long *boot_addr);
 
+static inline u32 exynos_product_id(void)
+{
+   return exynos_soc_info.product_id;
+}
+
+static inline u32 exynos_revision(void)
+{
+   return exynos_soc_info.revision;
+}
+
 static inline void pmu_raw_writel(u32 val, u32 offset)
 {
__raw_writel(val, pmu_base_addr + offset);
diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
index f977eea..89d7254 100644
--- a/arch/arm/mach-exynos/exynos.c
+++ b/arch/arm/mach-exynos/exynos.c
@@ -92,50 +92,11 @@ static void __init exynos_init_late(void)
exynos_pm_init();
 }
 
-static int __init exynos_fdt_map_chipid(unsigned long node, const char *uname,
-   int depth, void *data)
-{
-   struct map_desc iodesc;
-   const __be32 *reg;
-   int len;
-
-   if (!of_flat_dt_is_c

Re: [PATCH v1 00/10] * imx-sdma: misc fix *

2016-05-25 Thread Jiada Wang


Hello

On 05/17/2016 07:04 PM, Vinod Koul wrote:

> On Tue, May 17, 2016 at 12:47:46PM +0900, Jiada Wang wrote:
>> this patch set contains the following changes
>> 1. fix issues in cyclic dma
>> 2. add support to SYNC DMA termination
>> 3. avoid system hang, when SDMA channel 0 timeouts
>> 4. add lock to prevent race condition
>
> I have three series in my inbox with same title and version. whats going on?

Sorry for the confusion; I attempted to loop in Shawn but used the wrong
email address.

Thanks,
Jiada

>> Jiada Wang (10):
>>   dma: imx-sdma: use chn_real_count to report residue for UART
>>   dma: imx-sdma: don't update BD in isr routine
>>   dma: imx-sdma: clear BD_RROR flag before pass it to sdma script
>>   dma: imx-sdma: update sdma channel status for cyclic dma
>>   dma: imx-sdma: add flag to indicate SDMA channel state
>>   dma: imx-sdma: add terminate_all support
>>   dma: imx-sdma: Add synchronization support
>>   dma: imx-sdma: abort updating channel when it has been terminated
>>   dma: imx-sdma: disable channel 0 when it timeouts
>>   dma: imx-sdma: clear channel0 interrupt bit in irq routine
>>
>>  drivers/dma/imx-sdma.c | 113 +++--
>>  1 file changed, 82 insertions(+), 31 deletions(-)
>>
>> --
>> 2.4.5

--
To unsubscribe from this list: send the line "unsubscribe dmaengine" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v6 0/2] Introducing Exynos ChipId driver

2016-05-25 Thread Pankaj Dubey

Once again I am respinning this quite old patch series to introduce the
Exynos Chipid driver.

This patch series introduces the Exynos Chipid platform driver.
Each Exynos SoC has a ChipID block which provides the SoC's product ID
and revision number.
At the same time it reduces the dependency of mach-exynos files on
plat-samsung by removing the samsung_rev API; a similar API is introduced
in the chipid driver itself to get the revision number and product ID.

I have tested this patch series on Exynos5880 based Chromebook for
normal system boot and S2R.

Revision 5 and its discussion can be found here
 - http://lists.infradead.org/pipermail/linux-arm-kernel/2014-December/310046.html
Revision 4 and its discussion can be found here
 - https://lkml.org/lkml/2014/12/3/115
Changes since v5:
 - Addressed Rob's review comments.
 - Rebased on latest krzk/for-next branch and retested.

Changes since v4:
 - Removed custom sysfs entries as they were not providing any new information
   as pointed out by Arnd.
 - Removed functions exporting product_id and revision, instead we will export
   exynos_chipid_info structure. It will be helpful when we need to provide more
   fields of chipid outside of chipid, as commented by Yadwinder.
 - Converted all functions to __init.

Changes since v3:
 - This patch set contains 5/6 and 6/6 patch from v3 series.
 - Made EXYNOS_CHIPID config option non-user selectable,
   as suggested by Tomasz Figa.
 - Made uniform macro for EXYNOS4/5_SOC_MASK as EXYNOS_SOC_MASK as
   suggested by Tomasz Figa.
 - Made local variables static in chipid driver.
 - Added existing SoCs' product IDs.
 - Added platform driver support.

Changes since v2:
 - Reorganized patches as suggested by Tomasz Figa.
 - Addressed review comments of Tomasz Figa in i2c-s3c2410.c file.

Changes since v1:
 - Added patch to move i2c interrupt re-configuration code from exynos.c
   to i2c driver, as suggested by Arnd.
 - After above patch only user of SYS_I2C_CFG register is pm.c so moving
   save/restore of this register also into i2c driver.
 - Split up exynos4 and exynos5 machine descriptors to get rid of
   soc_is_exynos4/exynos5 kind of macros, as suggested by Arnd.
 - Changed location of chipid driver to "drivers/soc".
 - Added drivers/base/soc.c provided infrastructure to make SoC specific
   information available to user space via sysfs entry, as suggested by Arnd.

Pankaj Dubey (2):
  soc: samsung: add exynos chipid driver support
  ARM: EXYNOS: refactoring of mach-exynos to enable chipid driver

 arch/arm/mach-exynos/Kconfig                 |   1 +
 arch/arm/mach-exynos/common.h                |  53 -
 arch/arm/mach-exynos/exynos.c                |  49 ++--
 arch/arm/mach-exynos/include/mach/map.h      |   2 -
 arch/arm/mach-exynos/platsmp.c               |   2 +-
 arch/arm/mach-exynos/pm.c                    |   8 +-
 arch/arm/plat-samsung/cpu.c                  |  14 ---
 arch/arm/plat-samsung/include/plat/cpu.h     |   2 -
 arch/arm/plat-samsung/include/plat/map-s5p.h |   1 -
 drivers/soc/samsung/Kconfig                  |   5 +
 drivers/soc/samsung/Makefile                 |   1 +
 drivers/soc/samsung/exynos-chipid.c          | 172 +++
 include/linux/soc/samsung/exynos-soc.h       |  51
 create mode 100644 drivers/soc/samsung/exynos-chipid.c
 create mode 100644 include/linux/soc/samsung/exynos-soc.h

-- 
2.4.5

[PATCH v6 1/2] soc: samsung: add exynos chipid driver support

2016-05-25 Thread Pankaj Dubey

Exynos SoCs have a Chipid block for identifying product IDs and SoC
revisions. This patch provides initialization code for this
functionality and also exposes the information to user space through
sysfs entries.

This driver uses existing binding for exynos-chipid.

CC: Grant Likely 
CC: Rob Herring 
CC: Linus Walleij 
Signed-off-by: Pankaj Dubey 
---
 drivers/soc/samsung/Kconfig            |   5 +
 drivers/soc/samsung/Makefile           |   1 +
 drivers/soc/samsung/exynos-chipid.c    | 172 +
 include/linux/soc/samsung/exynos-soc.h |  51 ++
 4 files changed, 229 insertions(+)
 create mode 100644 drivers/soc/samsung/exynos-chipid.c
 create mode 100644 include/linux/soc/samsung/exynos-soc.h

diff --git a/drivers/soc/samsung/Kconfig b/drivers/soc/samsung/Kconfig
index d7fc123..fc793f3 100644
--- a/drivers/soc/samsung/Kconfig
+++ b/drivers/soc/samsung/Kconfig
@@ -10,4 +10,9 @@ config EXYNOS_PMU
bool "Exynos PMU controller driver" if COMPILE_TEST
depends on (ARM && ARCH_EXYNOS) || ((ARM || ARM64) && COMPILE_TEST)
 
+config EXYNOS_CHIPID
+   bool "Exynos Chipid controller driver" if COMPILE_TEST
+   depends on (ARM && ARCH_EXYNOS) || ((ARM || ARM64) && COMPILE_TEST)
+   select SOC_BUS
+
 endif
diff --git a/drivers/soc/samsung/Makefile b/drivers/soc/samsung/Makefile
index f64ac4d..81023ed 100644
--- a/drivers/soc/samsung/Makefile
+++ b/drivers/soc/samsung/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_EXYNOS_PMU)   += exynos-pmu.o exynos3250-pmu.o exynos4-pmu.o \
exynos5250-pmu.o exynos5420-pmu.o
+obj-$(CONFIG_EXYNOS_CHIPID)    += exynos-chipid.o
diff --git a/drivers/soc/samsung/exynos-chipid.c b/drivers/soc/samsung/exynos-chipid.c
new file mode 100644
index 000..fa20fdd
--- /dev/null
+++ b/drivers/soc/samsung/exynos-chipid.c
@@ -0,0 +1,172 @@
+/*
+ * Copyright (c) 2016 Samsung Electronics Co., Ltd.
+ *   http://www.samsung.com/
+ *
+ * EXYNOS - CHIP ID support
+ * Author: Pankaj Dubey 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define EXYNOS_SUBREV_MASK     (0xF << 4)
+#define EXYNOS_MAINREV_MASK    (0xF << 0)
+#define EXYNOS_REV_MASK        (EXYNOS_SUBREV_MASK | EXYNOS_MAINREV_MASK)
+
+static void __iomem *exynos_chipid_base;
+
+struct exynos_chipid_info exynos_soc_info;
+EXPORT_SYMBOL(exynos_soc_info);
+
+static const char * __init product_id_to_name(unsigned int product_id)
+{
+   const char *soc_name;
+   unsigned int soc_id = product_id & EXYNOS_SOC_MASK;
+
+   switch (soc_id) {
+   case EXYNOS3250_SOC_ID:
+   soc_name = "EXYNOS3250";
+   break;
+   case EXYNOS4210_SOC_ID:
+   soc_name = "EXYNOS4210";
+   break;
+   case EXYNOS4212_SOC_ID:
+   soc_name = "EXYNOS4212";
+   break;
+   case EXYNOS4412_SOC_ID:
+   soc_name = "EXYNOS4412";
+   break;
+   case EXYNOS4415_SOC_ID:
+   soc_name = "EXYNOS4415";
+   break;
+   case EXYNOS5250_SOC_ID:
+   soc_name = "EXYNOS5250";
+   break;
+   case EXYNOS5260_SOC_ID:
+   soc_name = "EXYNOS5260";
+   break;
+   case EXYNOS5420_SOC_ID:
+   soc_name = "EXYNOS5420";
+   break;
+   case EXYNOS5440_SOC_ID:
+   soc_name = "EXYNOS5440";
+   break;
+   case EXYNOS5800_SOC_ID:
+   soc_name = "EXYNOS5800";
+   break;
+   default:
+   soc_name = "UNKNOWN";
+   }
+   return soc_name;
+}
+
+static const struct of_device_id of_exynos_chipid_ids[] = {
+   {
+   .compatible = "samsung,exynos4210-chipid",
+   },
+   {},
+};
+
+/**
+ *  exynos_chipid_early_init: Early chipid initialization
+ *  @dev: pointer to chipid device
+ */
+int __init exynos_chipid_early_init(struct device *dev)
+{
+   struct device_node *np;
+   const struct of_device_id *match;
+
+   if (exynos_chipid_base)
+   return 0;
+
+   if (!dev)
+   np = of_find_matching_node_and_match(NULL,
+   of_exynos_chipid_ids, &match);
+   else
+   np = dev->of_node;
+
+   if (!np)
+   return -ENODEV;
+
+   exynos_chipid_base = of_iomap(np, 0);
+
+   if (!exynos_chipid_base)
+   return -ENOMEM; /* of_iomap() returns NULL on failure, and PTR_ERR(NULL) is 0 */
+   
+   exynos_soc_info.product_id  = __raw_readl(exynos_chipid_base);
+   exynos_soc_info.revision = exynos_soc_info.product_id & EXYNOS_REV_MASK;
+
+   return 0;
+}
+
+static int __init exynos_chipid_probe(struct platform_device *pdev)
+{
+

Re: [PATCH v2 4/5] iommu/mediatek: add support for mtk iommu generation one HW

2016-05-25 Thread Honghui Zhang

On Tue, 2016-05-24 at 16:36 +0100, Robin Murphy wrote:
> On 24/05/16 10:57, Honghui Zhang wrote:
> [...]
> >>> @@ -48,6 +48,9 @@ struct mtk_iommu_domain {
> >>>   struct io_pgtable_ops   *iop;
> >>>
> >>>   struct iommu_domain domain;
> >>> + void*pgt_va;
> >>> + dma_addr_t  pgt_pa;
> >>> + void*cookie;
> >>
> >> These are going to be mutually exclusive with the cfg and iop members,
> >> which implies it might be a good idea to use a union and not waste
> >> space. Or better, just forward-declare struct mtk_iommu_domain here and
> >> leave separate definitions private to each driver. The void *cookie is
> >> also an unnecessary level of abstraction, I think.
> >>
> >
> > Do you mean declare struct mtk_iommu_domain here, and implement a new
> > struct in mtk_iommu_v1.c like
> > struct mtk_iommu_domain_v1 {
> > struct mtk_iommu_domain domain;
> > u32 *pgt_va;
> > dma_addr_t  pgt_pa;
> > mtk_iommu_data  *data;
> > };
> > If this is acceptable I would implement it in the next version.
> 
> Pretty much, except they both want to be called struct mtk_iommu_domain, 
> so that a *declaration* for the sake of the m4u_dom member of struct 
> mtk_iommu_data in the header file can remain common to both drivers - it 
> then just picks up whichever private *definition* from the .c file being 
> compiled.

I will follow your advice in the next version, thanks very much.
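
The declaration/definition split Robin describes can be sketched in plain
C. This is only an illustration under stated assumptions, not the driver
code: the members of the private definition are taken from the v1 struct
proposed above, uint64_t stands in for dma_addr_t, and the demo_* helpers
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* --- what the shared header would carry --- */
struct mtk_iommu_domain;                /* declaration only, no members */

struct mtk_iommu_data {
	struct mtk_iommu_domain *m4u_dom;   /* pointer to an incomplete type is fine */
};

/* --- what one driver's .c file would carry: its private definition --- */
struct mtk_iommu_domain {
	uint32_t *pgt_va;                   /* page table, CPU address */
	uint64_t  pgt_pa;                   /* stand-in for dma_addr_t */
};

/* Hypothetical helper: allocate a domain with `entries` page-table words. */
static struct mtk_iommu_domain *demo_domain_alloc(size_t entries)
{
	struct mtk_iommu_domain *dom;

	assert(entries > 0);
	dom = calloc(1, sizeof(*dom));
	if (!dom)
		return NULL;
	dom->pgt_va = calloc(entries, sizeof(*dom->pgt_va));
	if (!dom->pgt_va) {
		free(dom);
		return NULL;
	}
	return dom;
}

/* Hypothetical helper: release a domain allocated above. */
static void demo_domain_free(struct mtk_iommu_domain *dom)
{
	if (!dom)
		return;
	free(dom->pgt_va);
	free(dom);
}
```

Code that only sees the header can store and pass the m4u_dom pointer but
cannot touch its members, so each driver is free to give the struct a
different layout.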

> 
> >>>};
> >>>
> >>>struct mtk_iommu_data {
> >>> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> >>> new file mode 100644
> >>> index 000..55023e1
> >>> --- /dev/null
> >>> +++ b/drivers/iommu/mtk_iommu_v1.c
> >>> @@ -0,0 +1,742 @@
> >>> +/*
> >>> + * Copyright (c) 2015-2016 MediaTek Inc.
> >>> + * Author: Yong Wu 
> >>
> >> Nit: is that in the sense that this patch should also have Yong's
> >> signed-off-by on it, or in that it's your work derived from his version
> >> in mtk_iommu.c?
> >
> > I write this driver based on Yong's version of mtk_iommu.c, should I add
> > his signed-off-by for this patch? Or should I put a comment about this?
> > Thanks.
> 
> OK, in that case I think the appropriate attribution would be along the 
> lines of "Author: Honghui Zhang, based on mtk_iommu.c by Yong Wu" (if in 
> doubt, grepping for "Based on" gives a feel for how this is commonly 
> done). If the work that comprises this patch itself (i.e. the copying 
> and modification of the existing code) is all yours then your sign-off 
> alone is fine.
> 
> [...]
> >>> +static int mtk_iommu_add_device(struct device *dev)
> >>> +{
> >>> + struct iommu_group *group;
> >>> + struct device_node *np;
> >>> + struct of_phandle_args iommu_spec;
> >>> + int idx = 0;
> >>> +
> >>> + while (!of_parse_phandle_with_args(dev->of_node, "iommus",
> >>> +"#iommu-cells", idx,
> >>> +&iommu_spec)) {
> >>
> >> Hang on, this doesn't seem right - why do you need to reimplement all
> >> this instead of using IOMMU_OF_DECLARE()?
> >
> > All the clients of mtk generation one iommu share the same iommu domain,
> > as a matter of fact, mtk generation one iommu only supports one iommu
> > domain. All the clients share the same iova address and use the same
> > pagetable. That means all iommu clients need to be attached to the
> > same dma_iommu_mapping.
> 
> Ugh, right - I'd forgotten that the arch/arm DMA mapping code doesn't 
> respect IOMMU groups or default domains at all. That's the real root 
> cause of the issue here.
> 
> > If we use IOMMU_OF_DECLARE, we need to call of_iommu_set_ops to set the
> > iommu_ops. I do not want the iommu_ops to be set, since that would put
> > each iommu client device in a different dma_iommu_mapping.
> >
> > When an iommu client device has been created, the following sequence is
> > called.
> >
> > of_platform_device_create
> > ->of_dma_config
> > ->arch_setup_dma_ops
> > ->arch_setup_iommu_dma_ops
> > arch_setup_iommu_dma_ops creates a new dma_iommu_mapping for each
> > iommu client device and then attaches the device to this new
> > dma_iommu_mapping. Since all the iommu clients share the very same
> > pagetable, this will not work for our HW.
> > I could not release the dma_iommu_mapping in attach_device since the
> > to_dma_iommu_mapping was set after device_attached.
> > Any suggestions for this?
> 
> On a second look, you're doing more or less the same thing that the 
> Renesas IPMMU driver currently does, so it's probably OK as a workaround 
> for now. Fixing the arch/arm code is part of the bigger ongoing problem 
> of sorting out IOMMU probing and DMA configuration, and it doesn't seem 
> fair to force that on you for the sake of one driver ;)
> 

Yes, I did read the IPMMU driver before writing this driver. Thanks.

> [...]
> >>> +static int __may

Re: [PATCH 00/10] String hash improvements

2016-05-25 Thread Geert Uytterhoeven

Hi George,

On Wed, May 25, 2016 at 9:20 AM, George Spelvin
 wrote:
> I'm not particularly fond of the names of the header files I created,
> but if anyone has a better idea please talk fast!

Usually this is handled through include/asm-generic/.
Put the generic default implementation in include/asm-generic/hash.h.

Architectures that need to override provide their own version, e.g.
arch/m68k/include/asm/hash.h. They may #include 
if they still want to reuse parts of the generic implementation.

Other architectures add "generic-y += hash.h" to their
arch//include/asm/Kbuild.

 includes  t.
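
The override scheme described above can be sketched in a single file. This
is a rough illustration, not the layout of the actual headers:
DEMO_HAVE_ARCH_HASH_32 and the demo_* names are hypothetical, and the
multiplier is the GOLDEN_RATIO_32 value quoted in patch 08/10 of this
series.

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/*
 * An architecture header (hypothetical) would define
 * DEMO_HAVE_ARCH_HASH_32 and supply its own demo_arch_hash_32()
 * before this point; everyone else gets the generic default.
 */
#ifndef DEMO_HAVE_ARCH_HASH_32
/* Generic default: one 32x32-bit multiply, wrapping modulo 2^32. */
static inline uint32_t demo_arch_hash_32(uint32_t val)
{
	return val * GOLDEN_RATIO_32;
}
#endif

/* Common wrapper: keep the top `bits` bits of the product (1..32). */
static inline uint32_t demo_hash_32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return demo_arch_hash_32(val) >> (32 - bits);
}
```

Only architectures that override need to touch anything; the common
wrapper is identical for all of them.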

>  arch/h8300/include/asm/archhash.h  |  52 
>  arch/m68k/include/asm/archhash.h   |  67 +++
>  arch/microblaze/include/asm/archhash.h |  80 ++
>  include/linux/hash.h   | 111 
>  include/linux/stringhash.h |  76 +

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCHv3 0/2] target: make location of /var/targets configurable

2016-05-25 Thread Zhu Lingshan


Hi experts,

I think these patches are great, and I am ready to help in user space.

Thanks,
BR
Zhu Lingshan

On 05/09/2016 09:17 AM, Lee Duncan wrote:

On 04/14/2016 06:18 PM, Lee Duncan wrote:

These patches make the location of "/var/target" configurable,
though it still defaults to "/var/target".

This "target database directory" can only be changed
after the target_core_mod loads but before any
fabric drivers are loaded, and must be the pathname
of an existing directory.

This configuration is accomplished via the configfs
top-level target attribute "dbroot", i.e. dumping
out "/sys/kernel/config/target/dbroot" will normally
return "/var/target". Writing to this attribute
changes the location where the kernel looks for the
target database.

The first patch creates this configurable value for
the "dbroot", and the second patch modifies users
of this directory to use this new attribute.

Changes from v2:
  * Add locking around access to target driver list

Changes from v1:
  * Only allow changing target DB root before it
can be used by others
  * Validate that new DB root is a valid directory

Lee Duncan (2):
   target: make target db location configurable
   target: use new "dbroot" target attribute

  drivers/target/target_core_alua.c |  6 ++--
  drivers/target/target_core_configfs.c | 62 +++
  drivers/target/target_core_internal.h |  6 
  drivers/target/target_core_pr.c   |  2 +-
  4 files changed, 72 insertions(+), 4 deletions(-)


Ping?

Re: [PATCH 2/2] ARM: dts: Add async-bridge clock to MFC power domain for Exynos5420

2016-05-25 Thread Krzysztof Kozlowski

On 05/25/2016 09:48 AM, Krzysztof Kozlowski wrote:
> On 05/24/2016 07:41 PM, Javier Martinez Canillas wrote:
>> The MFC IP is also inter-connected by an Async-Bridge so the CLK_ACLK333
>> has to be ungated during a power domain switch. Trying to do it when the
>> clock is gated will fail and lead to an imprecise external abort error
>> when the driver tries to access the MFC registers with the PD disabled.
>>
>> For example, if the s5p-mfc module is removed and the MFC PD turned off:
>>
>> [  186.835606] Power domain power-domain@10044060 disable failed
>> [  186.835671] s5p-mfc 1100.codec: Removing 1100.codec
>> [  186.837670] Power domain power-domain@10044060 disable failed
>>
>> And when the module is inserted again:
>>
>> [ 2395.176956] s5p_mfc_wait_for_done_dev:34: Interrupt (dev->int_type:0, 
>> command:12) timed out
>> [ 2395.177031] s5p_mfc_init_hw:272: Failed to load firmware
>> [ 2395.177384] Unhandled fault: imprecise external abort (0x1406) at 
>> 0x
>> [ 2395.177441] pgd = ec3b4000
>> [ 2395.177467] [] *pgd=
>> [ 2395.177507] Internal error: : 1406 [#1] PREEMPT SMP ARM
>> [ 2395.177550] Modules linked in: s5p_mfc mwifiex_sdio mwifiex uvcvideo 
>> s5p_jpeg v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig 
>> videobuf2_memops videobuf2_v4l2 videobuf2_core v4l2_common videodev media 
>> [last unloaded: s5p_mfc]
>> [ 2395.14] CPU: 1 PID: 2382 Comm: v4l_id Tainted: GW   
>> 4.6.0-rc6-next-20160502-00010-g7730dc64d2c1-dirty #179
>> [ 2395.177857] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>> [ 2395.177906] task: ed275500 ti: e6c8c000 task.ti: e6c8c000
>> [ 2395.177996] PC is at s5p_mfc_reset+0x1c4/0x284 [s5p_mfc]
>> [ 2395.178057] LR is at s5p_mfc_reset+0x1a4/0x284 [s5p_mfc]
>>
>> This patch fixes this issue by adding the CLK_ACLK333 as an Async-Bridge
>> clock for the MFC power domain, so the PD configuration works properly.
>>
>> Signed-off-by: Javier Martinez Canillas 
>>
>> ---
>>
>>  arch/arm/boot/dts/exynos5420.dtsi | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Indeed patch #1 is not a hard dependency here because there are no other
> asb clocks. It is entirely obvious but works fine.

Damn, I wanted to write:
"It is not entirely obvious but works fine."
(in Exynos pm_domains driver the clk_get() returns -ENOENT and the loop
is escaped early)
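
A rough sketch of that early-exit behaviour, assuming the driver probes
for clocks named "asb0", "asb1", ... in order. demo_clk_lookup() is a
hypothetical stand-in for clk_get(), and the limit of four names is an
assumption made only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_ASB_CLKS 4   /* assumed upper bound, illustration only */

/* Stand-in for clk_get(): 0 if `name` is listed, else -ENOENT. */
static int demo_clk_lookup(const char *const *clock_names, size_t n,
			   const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(clock_names[i], name) == 0)
			return 0;
	return -ENOENT;
}

/* Count "asb0", "asb1", ... stopping at the first missing name. */
static int demo_count_asb_clks(const char *const *clock_names, size_t n)
{
	char name[8];
	int found = 0;

	assert(clock_names != NULL || n == 0);
	for (int i = 0; i < DEMO_MAX_ASB_CLKS; i++) {
		snprintf(name, sizeof(name), "asb%d", i);
		if (demo_clk_lookup(clock_names, n, name) == -ENOENT)
			break;  /* no more async-bridge clocks listed */
		found++;
	}
	return found;
}
```

With the clock-names from this patch ("oscclk", "clk0", "asb0") the loop
finds one clock and stops at "asb1".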

BR,
Krzysztof

> 
> Reviewed-by: Krzysztof Kozlowski 
> 
> Unless all other patches are meant to current fixes cycle (and/or
> cc-stable), I do not plan to apply it now. I'll take it for v4.8, because:
> 1. Your previous patches are needed. Without them bind/unbind won't work.
> 2. This is not reproducible in a regular driver operation.
> 3. It needs clock change to actually be useful.
> 
> Is it okay?
> 
> Best regards,
> Krzysztof
> 
>>
>> diff --git a/arch/arm/boot/dts/exynos5420.dtsi 
>> b/arch/arm/boot/dts/exynos5420.dtsi
>> index 4c8523471c65..f3e9d873633e 100644
>> --- a/arch/arm/boot/dts/exynos5420.dtsi
>> +++ b/arch/arm/boot/dts/exynos5420.dtsi
>> @@ -313,8 +313,9 @@
>>  mfc_pd: power-domain@10044060 {
>>  compatible = "samsung,exynos4210-pd";
>>  reg = <0x10044060 0x20>;
>> -clocks = <&clock CLK_FIN_PLL>, <&clock CLK_MOUT_USER_ACLK333>;
>> -clock-names = "oscclk", "clk0";
>> +clocks = <&clock CLK_FIN_PLL>, <&clock CLK_MOUT_USER_ACLK333>,
>> + <&clock CLK_ACLK333>;
>> +clock-names = "oscclk", "clk0","asb0";
>>  #power-domain-cells = <0>;
>>  };
>>  
>>
> 
>

Re: [PATCH v2 5/5] usb: dwc3: rockchip: add devicetree bindings documentation

2016-05-25 Thread Felipe Balbi


Hi,

William Wu  writes:
> Hi Felipe,
>
> On 05/24/2016 05:32 PM, Felipe Balbi wrote:
>> Hi,
>>
>> William Wu  writes:
>>> This patch documents the device tree documentation required for
>>> Rockchip USB3.0 core wrapper consist of USB3.0 IP from Synopsys.
>>>
>>> It could operate in device mode (SS, HS, FS) and host
>>> mode (SS, HS, FS, LS).
>>>
>>> Signed-off-by: William Wu 
>>> ---
>>> Changes in v2:
>>> - add rockchip,dwc3.txt to Documentation/devicetree/bindings/ (Felipe, 
>>> Brian)
>>>
>>>   .../devicetree/bindings/usb/rockchip,dwc3.txt  | 45 
>>> ++
>>>   1 file changed, 45 insertions(+)
>>>   create mode 100644 Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt 
>>> b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
>>> new file mode 100644
>>> index 000..10303d9
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
>>> @@ -0,0 +1,45 @@
>>> +Rockchip SuperSpeed DWC3 USB SoC controller
>>> +
>>> +Required properties:
>>> +- compatible:  should contain "rockchip,dwc3"
>>> +- clocks:  A list of phandle + clock-specifier pairs for the
>>> +   clocks listed in clock-names
>>> +- clock-names: Should contain the following:
>>> +  "clk_usb3otg0_ref"   Controller reference clk
>>> +  "clk_usb3otg0_suspend"Controller suspend clk, can use 24 MHz or 32 KHz
>>> +  "aclk_usb3"  Master/Core clock, have to be >= 62.5 MHz for 
>>> SS operation
>>> +
>>> +
>>> +Optional clocks:
>>> +  "aclk_usb3otg0"  Aclk for specific usb controller clock.
>>> +  "aclk_usb3_rksoc_axi_perf"  USB AXI perf clock.  Not present on all 
>>> platforms.
>>> +  "aclk_usb3_grf"  USB grf clock.  Not present on all platforms.
>>> +
>>> +Required child node:
>>> +A child node must exist to represent the core DWC3 IP block. The name of
>>> +the node is not important. The content of the node is defined in dwc3.txt.
>>> +
>>> +Phy documentation is provided in the following places:
>>> +
>>> +Example device nodes:
>>> +
>>> +   usbdrd3_0: usb@fe80 {
>>> +
>> no reg property?
> For now, we don't need a reg property here, because we only need to
> enable some clocks and populate its children in 
> drivers/usb/dwc3/dwc3-of-simple.c.
> And it's similar to arch/arm/boot/dts/exynos5420.dtsi usbdrd3_0 node.
>>  compatible = "rockchip,dwc3";
>>> +   clocks = <&cru SCLK_USB3OTG0_REF>, <&cru SCLK_USB3OTG0_SUSPEND>,
>>> +<&cru ACLK_USB3>, <&cru ACLK_USB3OTG0>,
>>> +<&cru ACLK_USB3_RKSOC_AXI_PERF>, <&cru ACLK_USB3_GRF>;
>>> +   clock-names = "clk_usb3otg0_ref", "clk_usb3otg0_suspend",
>>> + "aclk_usb3", "aclk_usb3otg0",
>>> + "aclk_usb3_rksoc_axi_perf", "aclk_usb3_grf";
>>> +   #address-cells = <2>;
>>> +   #size-cells = <2>;
>>> +   ranges;
>>> +   status = "disabled";
>>> +   usbdrd_dwc3_0: dwc3 {
>> no address here?
> I don't think an address is necessarily needed here. The child node dwc3 can 
> inherit the address from the parent node.
> And with this dtsi patch, the dev path shows as follows:
> /sys/devices/platform/usb@fe80/fe80.dwc3
>
> Is it needed for coding style or for some other reason?

I don't think your arguments match what devicetree folks want to see in
DT. Let's ask them. Rob, care to look at this one?

>
>>
>>> +   compatible = "snps,dwc3";
>>> +   reg = <0x0 0xfe80 0x0 0x10>;
>>> +   interrupts = ;
>>> +   dr_mode = "otg";
>>> +   status = "disabled";
>>> +   };
>>> +   };
>>> -- 
>>> 1.9.1
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
balbi



Re: [PATCH 08/10] m68k: Add

2016-05-25 Thread Geert Uytterhoeven

Hi George,

On Wed, May 25, 2016 at 9:34 AM, George Spelvin
 wrote:
> This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
> for the original mc68000, which lacks a 32x32-bit multiply instruction.
>
> Yes, the amount of optimization effort put in is excessive. :-)
>
> Addition chains found by Yevgen Voronenko's Hcub algorithm at
> http://spiral.ece.cmu.edu/mcm/gen.html
>
> Signed-off-by: George Spelvin 
> Cc: Geert Uytterhoeven 
> Cc: Greg Ungerer 
> Cc: linux-m...@lists.linux-m68k.org
> ---
>  arch/m68k/Kconfig|  1 +
>  arch/m68k/include/asm/archhash.h | 67 
> 
>  2 files changed, 68 insertions(+)
>  create mode 100644 arch/m68k/include/asm/archhash.h
>
> diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
> index 498b567f..95197d5e 100644
> --- a/arch/m68k/Kconfig
> +++ b/arch/m68k/Kconfig
> @@ -23,6 +23,7 @@ config M68K
> select MODULES_USE_ELF_RELA
> select OLD_SIGSUSPEND3
> select OLD_SIGACTION
> +   select HAVE_ARCH_HASH

"select HAVE_ARCH_HASH if M68000"?

Or better, move the select to the M68000 section in arch/m68k/Kconfig.cpu.

> --- /dev/null
> +++ b/arch/m68k/include/asm/archhash.h
> @@ -0,0 +1,67 @@
> +#ifndef _ASM_ARCHHASH_H
> +#define _ASM_ARCHHASH_H
> +
> +/*
> + * The only 68k processors that lack MULU.L and so need this workaround
> + * are the original 68000 and 68010.
> + *
> + * Annoyingly, GCC defines __mc68000 for all processors in the family;
> + * the only way to identify an mc68000 is by the *absence* of other
> + * symbols; __mcpu32, __mcoldfire__, __mc68020, etc.
> + */
> +#if ! (defined(__mc68020) || \
> +   defined(__mc68030) || \
> +   defined(__mc68040) || \
> +   defined(__mc68060) || \
> +   defined(__mcpu32)  || \
> +   defined(__mcoldfire))

With my comment above, you wouldn't need this, but I'm gonna comment anyway.

We don't use special GCCs to target specific CPU variants. Hence inside the
kernel, you should check the config symbols, to see if support for 68000 or
68010 (which isn't supported by the kernel yet) is enabled.

Hence the check should be:

#if defined(CONFIG_M68000) || defined(CONFIG_M68010)
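
For illustration, selecting on the config symbols could look like the
sketch below. The shift-and-add routine is a naive bit-by-bit expansion of
the constant, not the Hcub addition chain from George's patch, and all
demo_* names are hypothetical.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* Multiply by GOLDEN_RATIO_32 using only shifts and adds (mod 2^32). */
static uint32_t demo_mul_golden_shift_add(uint32_t x)
{
	uint32_t k = GOLDEN_RATIO_32;
	uint32_t sum = 0;

	/* Add x * 2^bit for every set bit of the constant. */
	for (unsigned int bit = 0; k != 0; bit++, k >>= 1)
		if (k & 1u)
			sum += x << bit;
	return sum;
}

/* Reference: what a CPU with a 32x32-bit multiply would compute. */
static uint32_t demo_mul_golden(uint32_t x)
{
	return x * GOLDEN_RATIO_32;
}

/* Pick the implementation from the kernel config, not compiler macros. */
#if defined(CONFIG_M68000) || defined(CONFIG_M68010)
#define demo_hash_mul demo_mul_golden_shift_add
#else
#define demo_hash_mul demo_mul_golden
#endif
```

Both routines return the same value for every input; only the
instructions used to get there differ.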

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH] mm: oom_kill_process: do not abort if the victim is exiting

2016-05-25 Thread Michal Hocko

On Tue 24-05-16 20:07:46, Vladimir Davydov wrote:
> On Tue, May 24, 2016 at 03:50:42PM +0200, Michal Hocko wrote:
[...]
> > It is not really pointless. The original intention was to not spam the
> > log and alarm the administrator when in fact the memory hog is exiting
> > already and will free the memory.
> 
> IMO the fact that a process, even an exiting one enters oom, is
> abnormal, indicates that the system is misconfigured, and hence should
> be reported to the admin.
> 
> > Those races are quite unlikely but not impossible.
> 
> If this case is unlikely, how can it spam the log?

The oom report can be quite large, especially on large setups. The
oom_reaper message will be much shorter and will give a clue that
an exceptional action had to be done.

> > The original check was much more optimistic as you said
> > above we have even removed one part of this heuristic. We can still end
> > up selecting an exiting task which is stuck and we could invoke the oom
> > reaper for it without excessive oom report. I agree that the current
> > check is still little bit optimistic but processes sharing the mm
> > (CLONE_VM without CLONE_THREAD/CLONE_SIGHAND) are really rare so I
> > wouldn't bother with them with a high priority.
> > 
> > That being said I would prefer to keep the check for now. After the
> > merge window closes I will send other oom enhancements which I have
> > half baked locally and that should make task_will_free_mem much more
> > reliable and the check would serve as a last resort to reduce oom noise.
> 
> I don't agree that a message about oom killing an exiting process is
> noise, because that shouldn't happen on a properly configured system.
> To me this racy check looks more like noise in the kernel code. By the
> time we enter oom we should have scanned lru several times to find no
> reclaimable pages. The system must be really sluggish. What's the point
> in deceiving the admin by suppressing the warning?

Well, my understanding of the OOM report is that it should tell you two
things. The first is to give you an overview of the overall memory
situation when the system went OOM, and the second is to give you
information that something has been _killed_ and what the criteria were
for selecting it (points). While the first one might be
interesting for what you write above, the second is not, and it might
even be misleading because we are not killing anything and the selected
task is dying without kernel intervention. So I dunno. I do not see
any strong reason to drop these few lines of code, which should not be a
maintenance burden. task_will_free_mem will need some changes to be more
robust anyway. If you really see a strong reason to drop it because it
would help to debug OOM situations then I won't insist...
-- 
Michal Hocko
SUSE Labs

Re: [PATCH 00/10] String hash improvements

2016-05-25 Thread George Spelvin

Geert Uytterhoeven wrote:
> Usually this is handled through include/asm-generic/.
> Put the generic default implementation in include/asm-generic/hash.h.
>
> Architectures that need to override provide their own version, e.g.
> arch/m68k/include/asm/hash.h. They may #include 
> if they still want to reuse parts of the generic implementation.
>
> Other architectures add "generic-y += hash.h" to their
> arch//include/asm/Kbuild.

I thought about that, but then I'd have to edit *every* architecture,
and might need acks from all the maintainers.

I was looking for something that was a total no-op on most architectures.

But if this is preferred, it's not technically difficult at all.

If asm-generic were in the  search path, it would magically
Just Work, but leftover files from a broken checkout would be a big
potential problem.

[PATCH V3 2/2] i2c: qup: Fix error handling

2016-05-25 Thread Sricharan R

Among the bus errors reported from the QUP_MASTER_STATUS register
only NACK is considered and transfer gets suspended, while
other errors are ignored. Correct this and suspend the transfer
for other errors as well. This avoids unnecessary 'timeouts' which
happen when waiting for events that would never happen when there
is already an error condition on the bus. Also the error handling
procedure should be the same for both NACK and other bus errors in
case of dma mode. So correct that as well.
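
The return-code rule this change introduces can be sketched as a small
standalone function. This is a minimal sketch, not the driver code:
DEMO_NACK_FLAG is a placeholder for QUP_I2C_NACK_FLAG (whose real value is
not shown in this mail), and demo_xfer_status() is a hypothetical name.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit standing in for the driver's QUP_I2C_NACK_FLAG. */
#define DEMO_NACK_FLAG (1u << 3)

/*
 * Fold the wait result and the latched error state into one return
 * code: any bus or QUP error overrides `ret`; a NACK maps to -ENXIO
 * and every other error maps to -EIO.
 */
static int demo_xfer_status(int ret, uint32_t bus_err, uint32_t qup_err)
{
	if (bus_err || qup_err)
		return (bus_err & DEMO_NACK_FLAG) ? -ENXIO : -EIO;
	return ret;
}
```

A caller can then tell an absent or unresponsive slave (-ENXIO) apart from
other bus failures (-EIO) and from a plain timeout (-ETIMEDOUT).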

Signed-off-by: Sricharan R 
---
[V3] Change the return error code for NACK and other bus errors.
 Added the error handling procedure for other bus errors in
 dma mode. Removed the dev_err print in case of NACK.

 drivers/i2c/busses/i2c-qup.c | 76 
 1 file changed, 41 insertions(+), 35 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
index 8620e99..edced86 100644
--- a/drivers/i2c/busses/i2c-qup.c
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -310,6 +310,7 @@ static int qup_i2c_wait_ready(struct qup_i2c_dev *qup, int 
op, bool val,
u32 opflags;
u32 status;
u32 shift = __ffs(op);
+   int ret = 0;
 
len *= qup->one_byte_t;
/* timeout after a wait of twice the max time */
@@ -321,18 +322,28 @@ static int qup_i2c_wait_ready(struct qup_i2c_dev *qup, 
int op, bool val,
 
if (((opflags & op) >> shift) == val) {
if ((op == QUP_OUT_NOT_EMPTY) && qup->is_last) {
-   if (!(status & I2C_STATUS_BUS_ACTIVE))
-   return 0;
+   if (!(status & I2C_STATUS_BUS_ACTIVE)) {
+   ret = 0;
+   goto done;
+   }
} else {
-   return 0;
+   ret = 0;
+   goto done;
}
}
 
-   if (time_after(jiffies, timeout))
-   return -ETIMEDOUT;
-
+   if (time_after(jiffies, timeout)) {
+   ret = -ETIMEDOUT;
+   goto done;
+   }
usleep_range(len, len * 2);
}
+
+done:
+   if (qup->bus_err || qup->qup_err)
+   ret =  (qup->bus_err & QUP_I2C_NACK_FLAG) ? -ENXIO : -EIO;
+
+   return ret;
 }
 
 static void qup_i2c_set_write_mode_v2(struct qup_i2c_dev *qup,
@@ -793,39 +804,35 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, 
struct i2c_msg *msg,
}
 
if (ret || qup->bus_err || qup->qup_err) {
-   if (qup->bus_err & QUP_I2C_NACK_FLAG) {
-   msg--;
-   dev_err(qup->dev, "NACK from %x\n", msg->addr);
-   ret = -EIO;
+   if (qup_i2c_change_state(qup, QUP_RUN_STATE)) {
+   dev_err(qup->dev, "change to run state timed out");
+   goto desc_err;
+   }
 
-   if (qup_i2c_change_state(qup, QUP_RUN_STATE)) {
-   dev_err(qup->dev, "change to run state timed 
out");
-   return ret;
-   }
+   if (rx_nents)
+   writel(QUP_BAM_INPUT_EOT,
+  qup->base + QUP_OUT_FIFO_BASE);
 
-   if (rx_nents)
-   writel(QUP_BAM_INPUT_EOT,
-  qup->base + QUP_OUT_FIFO_BASE);
+   writel(QUP_BAM_FLUSH_STOP, qup->base + QUP_OUT_FIFO_BASE);
 
-   writel(QUP_BAM_FLUSH_STOP,
-  qup->base + QUP_OUT_FIFO_BASE);
+   qup_i2c_flush(qup);
 
-   qup_i2c_flush(qup);
+   /* wait for remaining interrupts to occur */
+   if (!wait_for_completion_timeout(&qup->xfer, HZ))
+   dev_err(qup->dev, "flush timed out\n");
 
-   /* wait for remaining interrupts to occur */
-   if (!wait_for_completion_timeout(&qup->xfer, HZ))
-   dev_err(qup->dev, "flush timed out\n");
+   qup_i2c_rel_dma(qup);
 
-   qup_i2c_rel_dma(qup);
-   }
+   ret =  (qup->bus_err & QUP_I2C_NACK_FLAG) ? -ENXIO : -EIO;
}
 
+desc_err:
dma_unmap_sg(qup->dev, qup->btx.sg, tx_nents, DMA_TO_DEVICE);
 
if (rx_nents)
dma_unmap_sg(qup->dev, qup->brx.sg, rx_nents,
 DMA_FROM_DEVICE);
-desc_err:
+
return ret;
 }
 
@@ -841,9 +848,6 @@ static int qup_i2c_bam_xfer(struct i2c_adapter *adap, 
struct i2c_msg *msg,
if (ret)
goto out;
 
-   qup->bus_err = 0;
-   qup->qup_err = 0;
-

[PATCH 2/2] ARM: at91: Add DT support for Olimex SAM9-L9260 board.

2016-05-25 Thread Raashid Muhammed

From: Raashid Muhammed 

sam9-l9260 is a low cost board designed by Olimex.

More information is available at:
https://www.olimex.com/Products/ARM/Atmel/SAM9-L9260/

Signed-off-by: Raashid Muhammed 
Reviewed-by: Vijay Kumar B. 
---
 Documentation/devicetree/bindings/arm/olimex.txt |   8 +-
 arch/arm/boot/dts/Makefile   |   1 +
 arch/arm/boot/dts/at91-sam9_l9260.dts| 115 +++
 3 files changed, 122 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm/boot/dts/at91-sam9_l9260.dts

diff --git a/Documentation/devicetree/bindings/arm/olimex.txt 
b/Documentation/devicetree/bindings/arm/olimex.txt
index 007fb5c..d726aec 100644
--- a/Documentation/devicetree/bindings/arm/olimex.txt
+++ b/Documentation/devicetree/bindings/arm/olimex.txt
@@ -1,5 +1,9 @@
-Olimex i.MX Platforms Device Tree Bindings
---
+Olimex Device Tree Bindings
+---
+
+SAM9-L9260 Board
+Required root node properties:
+- compatible = "olimex,sam9-l9260", "atmel,at91sam9260";
 
 i.MX23 Olinuxino Low Cost Board
 Required root node properties:
diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 0f89d87..d27f09d 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -19,6 +19,7 @@ dtb-$(CONFIG_SOC_SAM_V4_V5) += \
usb_a9260.dtb \
at91sam9261ek.dtb \
at91sam9263ek.dtb \
+   at91-sam9_l9260.dtb \
tny_a9263.dtb \
usb_a9263.dtb \
at91-foxg20.dtb \
diff --git a/arch/arm/boot/dts/at91-sam9_l9260.dts 
b/arch/arm/boot/dts/at91-sam9_l9260.dts
new file mode 100644
index 000..e2da9b906
--- /dev/null
+++ b/arch/arm/boot/dts/at91-sam9_l9260.dts
@@ -0,0 +1,115 @@
+/*
+ * at91-sam9_l9260.dts - Device Tree file for Olimex SAM9-L9260 board
+ *
+ *  Copyright (C) 2016 Raashid Muhammed 
+ *
+ * Licensed under GPLv2 or later.
+ */
+/dts-v1/;
+#include "at91sam9260.dtsi"
+
+/ {
+   model = "Olimex sam9-l9260";
+   compatible = "olimex,sam9-l9260", "atmel,at91sam9260", "atmel,at91sam9";
+
+   chosen {
+   stdout-path = "serial0:115200n8";
+};
+
+memory {
+   reg = <0x2000 0x400>;
+};
+
+clocks {
+   slow_xtal {
+   clock-frequency = <32768>;
+   };
+
+   main_xtal {
+   clock-frequency = <18432000>;
+   };
+};
+
+ahb {
+   apb {
+   mmc0: mmc@fffa8000 {
+   pinctrl-0 = <
+   &pinctrl_board_mmc0
+   &pinctrl_mmc0_clk
+   &pinctrl_mmc0_slot1_cmd_dat0
+   &pinctrl_mmc0_slot1_dat1_3>;
+   status = "okay";
+
+   slot@1 {
+   reg = <1>;
+   bus-width = <4>;
+   cd-gpios = <&pioC 8 GPIO_ACTIVE_HIGH>;
+   wp-gpios = <&pioC 4 GPIO_ACTIVE_HIGH>;
+   };
+   };
+
+   macb0: ethernet@fffc4000 {
+   phy-mode = "mii";
+   #address-cells = <1>;
+   #size-cells = <0>;
+   status = "okay";
+
+   ethernet-phy@1 {
+   reg = <0x1>;
+   };
+};
+
+   spi0: spi@fffc8000 {
+   cs-gpios = <&pioC 11 0>, <0>, <0>, <0>;
+   status = "okay";
+
+   mtd_dataflash@0 {
+   compatible = "atmel,at45", 
"atmel,dataflash";
+   spi-max-frequency = <1500>;
+   reg = <0>;
+   };
+   };
+
+   dbgu: serial@f200 {
+   status = "okay";
+};
+
+   pinctrl@f400 {
+   mmc0 {
+   pinctrl_board_mmc0: mmc0-board {
+   atmel,pins =
+   ;/* WP pin */
+   };
+   };
+   };
+};
+
+   nand0: nand@4000 {
+  nand-bus-width = <8>;
+  nand-ecc-mode = "soft";
+  nand-on-flash-bbt = <1>;
+  status = "okay";
+};
+
+

[PATCH V3 1/2] i2c: qup: Fix broken dma when CONFIG_DEBUG_SG is enabled

2016-05-25 Thread Sricharan R

With CONFIG_DEBUG_SG enabled and DMA mode in use, the following dump is seen:

[ cut here ]
kernel BUG at include/linux/scatterlist.h:140!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-00459-g9f087b9-dirty #7
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
task: ffc036868000 ti: ffc03687 task.ti: ffc03687
PC is at qup_sg_set_buf.isra.13+0x138/0x154
LR is at qup_sg_set_buf.isra.13+0x50/0x154
pc : [] lr : [] pstate: 6145
sp : ffc0368735c0
x29: ffc0368735c0 x28: ffc036873752
x27: ffc035233018 x26: ffc000c4e000
x25:  x24: 0004
x23:  x22: ffc035233668
x21: ff80004e3000 x20: ffc0352e0018
x19: 0040 x18: 0028
x17: 0004 x16: ffc0017a39c8
x15: 1cdf x14: ffc0019929d8
x13: ffc0352e0018 x12: 
x11: 0001 x10: 0001
x9 : ffc0012b2d70 x8 : ff80004e3000
x7 : 0018 x6 : 3000
x5 : ffc00199f018 x4 : ffc035233018
x3 : 0004 x2 : c000
x1 : 0003 x0 : 

Process swapper/0 (pid: 1, stack limit = 0xffc036870020)
Stack: (0xffc0368735c0 to 0xffc036874000)

sg_set_buf expects that the buf parameter passed in should be from
lowmem and a valid pageframe. This is not true for pages from
dma_alloc_coherent which can be carveouts, hence the check fails.
Change allocation of sg buffers from dma_coherent memory to kzalloc
to fix the issue.

Signed-off-by: Sricharan R 
Reviewed-by: Andy Gross 
Tested-by: Naveen Kaje 
---
[V3] Added more commit description.

 drivers/i2c/busses/i2c-qup.c | 53 ++--
 1 file changed, 17 insertions(+), 36 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
index 23eaabb..8620e99 100644
--- a/drivers/i2c/busses/i2c-qup.c
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -585,8 +585,8 @@ static void qup_i2c_bam_cb(void *data)
 }
 
 static int qup_sg_set_buf(struct scatterlist *sg, void *buf,
- struct qup_i2c_tag *tg, unsigned int buflen,
- struct qup_i2c_dev *qup, int map, int dir)
+ unsigned int buflen, struct qup_i2c_dev *qup,
+ int dir)
 {
int ret;
 
@@ -595,9 +595,6 @@ static int qup_sg_set_buf(struct scatterlist *sg, void *buf,
if (!ret)
return -EINVAL;
 
-   if (!map)
-   sg_dma_address(sg) = tg->addr + ((u8 *)buf - tg->start);
-
return 0;
 }
 
@@ -670,16 +667,15 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, 
struct i2c_msg *msg,
/* scratch buf to read the start and len tags */
ret = qup_sg_set_buf(&qup->brx.sg[rx_buf++],
 &qup->brx.tag.start[0],
-&qup->brx.tag,
-2, qup, 0, 0);
+2, qup, DMA_FROM_DEVICE);
 
if (ret)
return ret;
 
ret = qup_sg_set_buf(&qup->brx.sg[rx_buf++],
 &msg->buf[limit * i],
-NULL, tlen, qup,
-1, DMA_FROM_DEVICE);
+tlen, qup,
+DMA_FROM_DEVICE);
if (ret)
return ret;
 
@@ -688,7 +684,7 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, 
struct i2c_msg *msg,
}
ret = qup_sg_set_buf(&qup->btx.sg[tx_buf++],
 &qup->start_tag.start[off],
-&qup->start_tag, len, qup, 0, 0);
+len, qup, DMA_TO_DEVICE);
if (ret)
return ret;
 
@@ -696,8 +692,7 @@ static int qup_i2c_bam_do_xfer(struct qup_i2c_dev *qup, 
struct i2c_msg *msg,
/* scratch buf to read the BAM EOT and FLUSH tags */
ret = qup_sg_set_buf(&qup->brx.sg[rx_buf++],
 &qup->brx.tag.start[0],
-&qup->brx.tag, 2,
-qup, 0, 0);
+2, qup, DMA_FROM_DEVICE);
if (ret)
return ret;
} else {
@@ -709,17 +704,15 @@ static

[PATCH 1/2] ARM: at91/dt: at91sam9260: Remove leading zeros in OHCI node.

2016-05-25 Thread Raashid Muhammed

From: Raashid Muhammed 

Remove leading zeros in OHCI node for at91sam9260 based boards.

Signed-off-by: Raashid Muhammed 
Reviewed-by: Vijay Kumar B. 
---
 arch/arm/boot/dts/aks-cdu.dts   | 2 +-
 arch/arm/boot/dts/animeo_ip.dts | 2 +-
 arch/arm/boot/dts/at91-foxg20.dts   | 2 +-
 arch/arm/boot/dts/at91-kizbox.dts   | 2 +-
 arch/arm/boot/dts/at91-qil_a9260.dts| 2 +-
 arch/arm/boot/dts/at91sam9260.dtsi  | 2 +-
 arch/arm/boot/dts/at91sam9g20ek_common.dtsi | 2 +-
 arch/arm/boot/dts/ethernut5.dts | 2 +-
 arch/arm/boot/dts/evk-pro3.dts  | 2 +-
 arch/arm/boot/dts/usb_a9260_common.dtsi | 2 +-
 10 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm/boot/dts/aks-cdu.dts b/arch/arm/boot/dts/aks-cdu.dts
index d9c50fb..5b1bf92 100644
--- a/arch/arm/boot/dts/aks-cdu.dts
+++ b/arch/arm/boot/dts/aks-cdu.dts
@@ -57,7 +57,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/animeo_ip.dts b/arch/arm/boot/dts/animeo_ip.dts
index 0962f2f..8bcbad3 100644
--- a/arch/arm/boot/dts/animeo_ip.dts
+++ b/arch/arm/boot/dts/animeo_ip.dts
@@ -114,7 +114,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
atmel,vbus-gpio = <&pioB 15 GPIO_ACTIVE_LOW>;
status = "okay";
diff --git a/arch/arm/boot/dts/at91-foxg20.dts 
b/arch/arm/boot/dts/at91-foxg20.dts
index 6bf873e..42a535d 100644
--- a/arch/arm/boot/dts/at91-foxg20.dts
+++ b/arch/arm/boot/dts/at91-foxg20.dts
@@ -128,7 +128,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/at91-kizbox.dts 
b/arch/arm/boot/dts/at91-kizbox.dts
index 229e989..0e3f34e 100644
--- a/arch/arm/boot/dts/at91-kizbox.dts
+++ b/arch/arm/boot/dts/at91-kizbox.dts
@@ -54,7 +54,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <1>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/at91-qil_a9260.dts 
b/arch/arm/boot/dts/at91-qil_a9260.dts
index 4f2eebf..5309e19 100644
--- a/arch/arm/boot/dts/at91-qil_a9260.dts
+++ b/arch/arm/boot/dts/at91-qil_a9260.dts
@@ -111,7 +111,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/at91sam9260.dtsi 
b/arch/arm/boot/dts/at91sam9260.dtsi
index d4884dd..af5ba31 100644
--- a/arch/arm/boot/dts/at91sam9260.dtsi
+++ b/arch/arm/boot/dts/at91sam9260.dtsi
@@ -1007,7 +1007,7 @@
status = "disabled";
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
compatible = "atmel,at91rm9200-ohci", "usb-ohci";
reg = <0x0050 0x10>;
interrupts = <20 IRQ_TYPE_LEVEL_HIGH 2>;
diff --git a/arch/arm/boot/dts/at91sam9g20ek_common.dtsi 
b/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
index e9cc99b..1395b30 100644
--- a/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
+++ b/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
@@ -170,7 +170,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/ethernut5.dts b/arch/arm/boot/dts/ethernut5.dts
index 2430443..298ee2f 100644
--- a/arch/arm/boot/dts/ethernut5.dts
+++ b/arch/arm/boot/dts/ethernut5.dts
@@ -77,7 +77,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/evk-pro3.dts b/arch/arm/boot/dts/evk-pro3.dts
index f72969e..312d3e8 100644
--- a/arch/arm/boot/dts/evk-pro3.dts
+++ b/arch/arm/boot/dts/evk-pro3.dts
@@ -46,7 +46,7 @@
};
};
 
-   usb0: ohci@0050 {
+   usb0: ohci@50 {
num-ports = <2>;
status = "okay";
};
diff --git a/arch/arm/boot/dts/usb_a9260_common.dtsi 
b/arch/arm/boot/dts/usb_a9260_common.dtsi
index 9beea89..12119b5 100644
--- a/arch/arm/boot/dts/usb_a9260_common.dtsi
++

RE: [PATCH v3 1/2] i2c: qup: add ACPI support

2016-05-25 Thread Sricharan

Hi,

>From: Naveen Kaje 
>
>Add support to get the device parameters from ACPI. Assume
>that the clocks are managed by firmware.
>
>Signed-off-by: Naveen Kaje 
>Signed-off-by: Austin Christ 
>---
> drivers/i2c/busses/i2c-qup.c | 60 
> 1 file changed, 44 insertions(+), 16 deletions(-)
>
>Changes:
>- v3:
> - clean up unused variable
>- v2:
> - clean up redundant checks and variables
>
>diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
>index 4da..ea6ca5f 100644
>--- a/drivers/i2c/busses/i2c-qup.c
>+++ b/drivers/i2c/busses/i2c-qup.c
>@@ -29,6 +29,7 @@
> #include 
> #include 
> #include 
>+#include 
>
> /* QUP Registers */
> #define QUP_CONFIG0x000
>@@ -132,6 +133,10 @@
> /* Max timeout in ms for 32k bytes */
> #define TOUT_MAX  300
>
>+/* Default values. Use these if FW query fails */
>+#define DEFAULT_CLK_FREQ 10
>+#define DEFAULT_SRC_CLK 2000
>+
> struct qup_i2c_block {
>   int count;
>   int pos;
>@@ -1354,14 +1359,13 @@ static void qup_i2c_disable_clocks(struct qup_i2c_dev 
>*qup)
> static int qup_i2c_probe(struct platform_device *pdev)
> {
>   static const int blk_sizes[] = {4, 16, 32};
>-  struct device_node *node = pdev->dev.of_node;
>   struct qup_i2c_dev *qup;
>   unsigned long one_bit_t;
>   struct resource *res;
>   u32 io_mode, hw_ver, size;
>   int ret, fs_div, hs_div;
>-  int src_clk_freq;
>-  u32 clk_freq = 10;
>+  u32 src_clk_freq = 0;
>+  u32 clk_freq = 0;
>   int blocks;
>
>   qup = devm_kzalloc(&pdev->dev, sizeof(*qup), GFP_KERNEL);
>@@ -1372,7 +1376,12 @@ static int qup_i2c_probe(struct platform_device *pdev)
>   init_completion(&qup->xfer);
>   platform_set_drvdata(pdev, qup);
>
>-  of_property_read_u32(node, "clock-frequency", &clk_freq);
>+  ret = device_property_read_u32(qup->dev, "clock-frequency", &clk_freq);
>+  if (ret) {
>+  dev_warn(qup->dev, "using default clock-frequency %d",
>+  DEFAULT_CLK_FREQ);
>+  clk_freq = DEFAULT_CLK_FREQ;
>+  }
>
>   if (of_device_is_compatible(pdev->dev.of_node, "qcom,i2c-qup-v1.1.1")) {
>   qup->adap.algo = &qup_i2c_algo;
>@@ -1454,20 +1463,31 @@ nodma:
>   return qup->irq;
>   }
>
>-  qup->clk = devm_clk_get(qup->dev, "core");
>-  if (IS_ERR(qup->clk)) {
>-  dev_err(qup->dev, "Could not get core clock\n");
>-  return PTR_ERR(qup->clk);
>-  }
>+  if (ACPI_HANDLE(qup->dev)) {
>+  ret = device_property_read_u32(qup->dev,
>+  "src-clock-hz", &src_clk_freq);
>+  if (ret) {
>+  dev_warn(qup->dev, "using default src-clock-hz %d",
>+  DEFAULT_SRC_CLK);
>+  src_clk_freq = DEFAULT_SRC_CLK;
>+  }
>+  ACPI_COMPANION_SET(&qup->adap.dev, ACPI_COMPANION(qup->dev));
>+  } else {
>+  qup->clk = devm_clk_get(qup->dev, "core");
>+  if (IS_ERR(qup->clk)) {
>+  dev_err(qup->dev, "Could not get core clock\n");
>+  return PTR_ERR(qup->clk);
>+  }
>
>-  qup->pclk = devm_clk_get(qup->dev, "iface");
>-  if (IS_ERR(qup->pclk)) {
>-  dev_err(qup->dev, "Could not get iface clock\n");
>-  return PTR_ERR(qup->pclk);
>+  qup->pclk = devm_clk_get(qup->dev, "iface");
>+  if (IS_ERR(qup->pclk)) {
>+  dev_err(qup->dev, "Could not get iface clock\n");
>+  return PTR_ERR(qup->pclk);
>+  }
>+  qup_i2c_enable_clocks(qup);
>+  src_clk_freq = clk_get_rate(qup->clk);
>   }
>
>-  qup_i2c_enable_clocks(qup);
>-
>   /*
>* Bootloaders might leave a pending interrupt on certain QUP's,
>* so we reset the core before registering for interrupts.
>@@ -1514,7 +1534,6 @@ nodma:
>   size = QUP_INPUT_FIFO_SIZE(io_mode);
>   qup->in_fifo_sz = qup->in_blk_sz * (2 << size);
>
>-  src_clk_freq = clk_get_rate(qup->clk);
>   fs_div = ((src_clk_freq / clk_freq) / 2) - 3;
>   hs_div = 3;
>   qup->clk_ctl = (hs_div << 8) | (fs_div & 0xff);
>@@ -1639,6 +1658,14 @@ static const struct of_device_id qup_i2c_dt_match[] = {
> };
> MODULE_DEVICE_TABLE(of, qup_i2c_dt_match);
>
>+#if IS_ENABLED(CONFIG_ACPI)
>+static const struct acpi_device_id qup_i2c_acpi_match[] = {
>+  { "QCOM8010"},
>+  { },
>+};
>+MODULE_DEVICE_TABLE(acpi, qup_i2c_acpi_ids);
>+#endif
>+
> static struct platform_driver qup_i2c_driver = {
>   .probe  = qup_i2c_probe,
>   .remove = qup_i2c_remove,
>@@ -1646,6 +1673,7 @@ static struct platform_driver qup_i2c_driver = {
>   .name = "i2c_qup",
>   .pm = &qup_i2c_qup_pm_ops,
>   .of_match_table = qup_i2c_dt_match,
>+  .acpi_mat

[PATCH V3 0/2] i2c: qup: Some misc fixes

2016-05-25 Thread Sricharan R

One patch fixes the bug seen with CONFIG_DEBUG_SG enabled; the other
suspends the transfer for all errors instead of just for NACK.

[V3] Added more commit description. Return more appropriate
 error codes for NACK and other bus errors. Corrected
 other bus errors handling procedure for dma mode as well.
 Removed the dev_err log for NACKs.

[V2] Removed the use of unnecessary variable assignment.

Kept the reviewed and Tested by tag for patch#1,
as there was no code change.

Depends on patch[1] for the error handling to be complete.

[1] https://lkml.org/lkml/2016/5/9/447

Sricharan R (2):
  i2c: qup: Fix broken dma when CONFIG_DEBUG_SG is enabled
  i2c: qup: Fix error handling

 drivers/i2c/busses/i2c-qup.c | 129 +++
 1 file changed, 58 insertions(+), 71 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation

RE: [PATCH v3 2/2] i2c: qup: support SMBus block read

2016-05-25 Thread Sricharan

Hi,

>From: Naveen Kaje 
>
>I2C QUP driver relies on SMBus emulation support from the framework.
>To handle SMBus block reads, the driver should check I2C_M_RECV_LEN
>flag and should read the first byte received as the message length.
>
>The driver configures the QUP hardware to read one byte. Once the
>message length is known from this byte, the QUP hardware is configured
>to read the rest.
>
>Signed-off-by: Naveen Kaje 
>Signed-off-by: Austin Christ 
>---
> drivers/i2c/busses/i2c-qup.c | 68 ++--
> 1 file changed, 65 insertions(+), 3 deletions(-)
>
>Changes:
>- v3:
> - clean up redundant checks
> - use constant instead of variable for smbus length field
>- v2:
> - rework the smbus block read and break into separate function
>
>diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
>index ea6ca5f..9fbed83 100644
>--- a/drivers/i2c/busses/i2c-qup.c
>+++ b/drivers/i2c/busses/i2c-qup.c
>@@ -517,6 +517,33 @@ static int qup_i2c_get_data_len(struct qup_i2c_dev *qup)
>   return data_len;
> }
>
>+static bool qup_i2c_check_msg_len(struct i2c_msg *msg)
>+{
>+  return ((msg->flags & I2C_M_RD) && (msg->flags & I2C_M_RECV_LEN));
>+}
>+
>+static int qup_i2c_set_tags_smb(u16 addr, u8 *tags, struct qup_i2c_dev *qup,
>+  struct i2c_msg *msg)
>+{
>+  int len = 0;
>+
>+  if (msg->len > 1) {
>+  tags[len++] = QUP_TAG_V2_DATARD_STOP;
>+  tags[len++] = qup_i2c_get_data_len(qup) - 1;
>+  } else {
>+  tags[len++] = QUP_TAG_V2_START;
>+  tags[len++] = addr & 0xff;
>+
>+  if (msg->flags & I2C_M_TEN)
>+  tags[len++] = addr >> 8;
>+
>+  tags[len++] = QUP_TAG_V2_DATARD;
>+  /* Read 1 byte indicating the length of the SMBus message */
>+  tags[len++] = 1;
>+  }
>+  return len;
>+}
>+
> static int qup_i2c_set_tags(u8 *tags, struct qup_i2c_dev *qup,
>   struct i2c_msg *msg,  int is_dma)
> {
>@@ -526,6 +553,10 @@ static int qup_i2c_set_tags(u8 *tags, struct qup_i2c_dev 
>*qup,
>
>   int last = (qup->blk.pos == (qup->blk.count - 1)) && (qup->is_last);
>
>+  /* Handle tags for SMBus block read */
>+  if (qup_i2c_check_msg_len(msg))
>+  return qup_i2c_set_tags_smb(addr, tags, qup, msg);
>+
>   if (qup->blk.pos == 0) {
>   tags[len++] = QUP_TAG_V2_START;
>   tags[len++] = addr & 0xff;
>@@ -1065,9 +1096,17 @@ static int qup_i2c_read_fifo_v2(struct qup_i2c_dev *qup,
>   struct i2c_msg *msg)
> {
>   u32 val;
>-  int idx, pos = 0, ret = 0, total;
>+  int idx, pos = 0, ret = 0, total, msg_offset = 0;
>
>+  /*
>+   * If the message length is already read in
>+   * the first byte of the buffer, account for
>+   * that by setting the offset
>+   */
>+  if (qup_i2c_check_msg_len(msg) && (msg->len > 1))
>+  msg_offset = 1;
>   total = qup_i2c_get_data_len(qup);
>+  total -= msg_offset;
>
>   /* 2 extra bytes for read tags */
>   while (pos < (total + 2)) {
>@@ -1087,8 +1126,8 @@ static int qup_i2c_read_fifo_v2(struct qup_i2c_dev *qup,
>
>   if (pos >= (total + 2))
>   goto out;
>-
>-  msg->buf[qup->pos++] = val & 0xff;
>+  msg->buf[qup->pos + msg_offset] = val & 0xff;
>+  qup->pos++;
>   }
>   }
>
>@@ -1128,6 +1167,24 @@ static int qup_i2c_read_one_v2(struct qup_i2c_dev *qup, 
>struct i2c_msg *msg)
>   goto err;
>
>   qup->blk.pos++;
>+
>+  /* Handle SMBus block read length */
>+  if (qup_i2c_check_msg_len(msg) && (msg->len == 1)) {
>+  if (msg->buf[0] > I2C_SMBUS_BLOCK_MAX) {
>+  ret = -EPROTO;
>+  goto err;
>+  }
>+  msg->len += msg->buf[0];
>+  qup->pos = 0;
>+  qup_i2c_set_read_mode_v2(qup, msg->len);
>+  ret = qup_i2c_issue_xfer_v2(qup, msg);
>+  if (ret)
>+  goto err;
>+  ret = qup_i2c_wait_for_complete(qup, msg);
>+  if (ret)
>+  goto err;
>+  qup_i2c_set_blk_data(qup, msg);
>+  }
>   } while (qup->blk.pos < qup->blk.count);
>
> err:
>@@ -1210,6 +1267,11 @@ static int qup_i2c_xfer(struct i2c_adapter *adap,
>   goto out;
>   }
>
>+  if (qup_i2c_check_msg_len(&msgs[idx])) {
>+  ret = -EOPNOTSUPP;
>+  goto out;
>+  }
>+
>   if (msgs[idx].flags & I2C_M_RD)
>   ret = qup_i2c_read_one(qup, &msgs[idx]);
>   else

Reviewed-by: sricha..
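
For anyone following the thread, the two-phase read described in the commit message can be sketched in user space roughly as below. This is only an illustration under stated assumptions: `struct byte_stream`, `read_bytes()` and `smbus_block_read()` are hypothetical names standing in for the QUP FIFO path, and `SMBUS_BLOCK_MAX` mirrors the value of `I2C_SMBUS_BLOCK_MAX`. It is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SMBUS_BLOCK_MAX 32	/* same value as I2C_SMBUS_BLOCK_MAX */

/* Hypothetical byte source standing in for the hardware FIFO. */
struct byte_stream {
	const uint8_t *data;
	size_t len;
	size_t pos;
};

/* Copy n bytes out of the stream, or fail if fewer are available. */
static int read_bytes(struct byte_stream *s, uint8_t *dst, size_t n)
{
	if (s->len - s->pos < n)
		return -EIO;
	for (size_t i = 0; i < n; i++)
		dst[i] = s->data[s->pos++];
	return 0;
}

/*
 * Two-phase block read: buf[0] receives the length byte and
 * buf[1..count] the payload. buf must hold SMBUS_BLOCK_MAX + 1 bytes.
 * Returns the total number of bytes stored, or a negative errno.
 */
static int smbus_block_read(struct byte_stream *s, uint8_t *buf)
{
	int ret;

	assert(s && buf);

	ret = read_bytes(s, &buf[0], 1);	/* phase 1: length byte only */
	if (ret)
		return ret;
	if (buf[0] > SMBUS_BLOCK_MAX)		/* reject an oversized count */
		return -EPROTO;

	ret = read_bytes(s, &buf[1], buf[0]);	/* phase 2: the payload */
	if (ret)
		return ret;
	return 1 + buf[0];
}
```

The payload is stored starting at offset 1, which corresponds to the msg_offset handling in qup_i2c_read_fifo_v2() in the patch.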

Re: [PATCH 1/3] clk: samsung: exynos5433: prepare for adding CPU clocks

2016-05-25 Thread Krzysztof Kozlowski

On 05/24/2016 03:19 PM, Bartlomiej Zolnierkiewicz wrote:
> Open-code samsung_cmu_register_one() calls for CMU_APOLLO and
> CMU_ATLAS setup code as a preparation for adding CPU clocks
> support for Exynos5433.
> 
> There should be no functional change resulting from this patch.
> 
> Cc: Kukjin Kim 
> CC: Krzysztof Kozlowski 
> Signed-off-by: Bartlomiej Zolnierkiewicz 
> ---
>  drivers/clk/samsung/clk-exynos5433.c | 85 
> +++-
>  drivers/clk/samsung/clk.c| 12 ++---
>  drivers/clk/samsung/clk.h|  4 ++
>  3 files changed, 65 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/clk/samsung/clk-exynos5433.c 
> b/drivers/clk/samsung/clk-exynos5433.c
> index 128527b..6dd81ed 100644
> --- a/drivers/clk/samsung/clk-exynos5433.c
> +++ b/drivers/clk/samsung/clk-exynos5433.c
> @@ -11,6 +11,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -3594,23 +3595,35 @@ static struct samsung_gate_clock apollo_gate_clks[] 
> __initdata = {
>   CLK_IGNORE_UNUSED | CLK_SET_RATE_PARENT, 0),
>  };
>  
> -static struct samsung_cmu_info apollo_cmu_info __initdata = {
> - .pll_clks   = apollo_pll_clks,
> - .nr_pll_clks= ARRAY_SIZE(apollo_pll_clks),
> - .mux_clks   = apollo_mux_clks,
> - .nr_mux_clks= ARRAY_SIZE(apollo_mux_clks),
> - .div_clks   = apollo_div_clks,
> - .nr_div_clks= ARRAY_SIZE(apollo_div_clks),
> - .gate_clks  = apollo_gate_clks,
> - .nr_gate_clks   = ARRAY_SIZE(apollo_gate_clks),
> - .nr_clk_ids = APOLLO_NR_CLK,
> - .clk_regs   = apollo_clk_regs,
> - .nr_clk_regs= ARRAY_SIZE(apollo_clk_regs),
> -};
> -
>  static void __init exynos5433_cmu_apollo_init(struct device_node *np)
>  {
> - samsung_cmu_register_one(np, &apollo_cmu_info);
> + void __iomem *reg_base;
> + struct samsung_clk_provider *ctx;
> +
> + reg_base = of_iomap(np, 0);
> + if (!reg_base) {
> + panic("%s: failed to map registers\n", __func__);
> + return;
> + }
> +
> + ctx = samsung_clk_init(np, reg_base, APOLLO_NR_CLK);
> + if (!ctx) {
> + panic("%s: unable to allocate ctx\n", __func__);
> + return;
> + }
> +
> + samsung_clk_register_pll(ctx, apollo_pll_clks,
> +  ARRAY_SIZE(apollo_pll_clks), reg_base);
> + samsung_clk_register_mux(ctx, apollo_mux_clks,
> +  ARRAY_SIZE(apollo_mux_clks));
> + samsung_clk_register_div(ctx, apollo_div_clks,
> +  ARRAY_SIZE(apollo_div_clks));
> + samsung_clk_register_gate(ctx, apollo_gate_clks,
> +   ARRAY_SIZE(apollo_gate_clks));
> + samsung_clk_sleep_init(reg_base, apollo_clk_regs,
> +ARRAY_SIZE(apollo_clk_regs));
> +
> + samsung_clk_of_add_provider(np, ctx);
>  }
>  CLK_OF_DECLARE(exynos5433_cmu_apollo, "samsung,exynos5433-cmu-apollo",
>   exynos5433_cmu_apollo_init);
> @@ -3806,23 +3819,35 @@ static struct samsung_gate_clock atlas_gate_clks[] 
> __initdata = {
>   CLK_IGNORE_UNUSED | CLK_SET_RATE_PARENT, 0),
>  };
>  
> -static struct samsung_cmu_info atlas_cmu_info __initdata = {
> - .pll_clks   = atlas_pll_clks,
> - .nr_pll_clks= ARRAY_SIZE(atlas_pll_clks),
> - .mux_clks   = atlas_mux_clks,
> - .nr_mux_clks= ARRAY_SIZE(atlas_mux_clks),
> - .div_clks   = atlas_div_clks,
> - .nr_div_clks= ARRAY_SIZE(atlas_div_clks),
> - .gate_clks  = atlas_gate_clks,
> - .nr_gate_clks   = ARRAY_SIZE(atlas_gate_clks),
> - .nr_clk_ids = ATLAS_NR_CLK,
> - .clk_regs   = atlas_clk_regs,
> - .nr_clk_regs= ARRAY_SIZE(atlas_clk_regs),
> -};
> -
>  static void __init exynos5433_cmu_atlas_init(struct device_node *np)
>  {
> - samsung_cmu_register_one(np, &atlas_cmu_info);
> + void __iomem *reg_base;
> + struct samsung_clk_provider *ctx;
> +
> + reg_base = of_iomap(np, 0);
> + if (!reg_base) {
> + panic("%s: failed to map registers\n", __func__);
> + return;

Return is useless here.

> + }
> +
> + ctx = samsung_clk_init(np, reg_base, ATLAS_NR_CLK);
> + if (!ctx) {
> + panic("%s: unable to allocate ctx\n", __func__);
> + return;
> + }

This entire if() is useless. The samsung_clk_init() already panics. I
recently tried to make it consistent across our drivers:
http://www.spinics.net/lists/arm-kernel/msg503014.html
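
To illustrate both remarks, here is a minimal user-space sketch, assuming a hypothetical `die()` that behaves like panic() (it never returns) and a hypothetical `provider_init()` that handles its own allocation failure the way samsung_clk_init() does. None of these names come from the clk code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct provider {
	int nr_clks;
};

/* Stand-in for panic(): prints the message and never returns. */
static _Noreturn void die(const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	abort();
}

/* Allocator that deals with its own failure, so it never returns NULL. */
static struct provider *provider_init(int nr_clks)
{
	struct provider *ctx;

	assert(nr_clks >= 0);
	ctx = malloc(sizeof(*ctx));
	if (!ctx)
		die("could not allocate clock provider");
		/* no "return" needed here: die() does not come back */
	ctx->nr_clks = nr_clks;
	return ctx;
}

/* Caller: no NULL check, because provider_init() cannot hand back NULL. */
static int cmu_init(int nr_clks)
{
	struct provider *ctx = provider_init(nr_clks);
	int n = ctx->nr_clks;

	free(ctx);
	return n;
}
```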

Beside that, looks fine.

Best regards,
Krzysztof

Re: [PATCH 08/10] m68k: Add

2016-05-25 Thread George Spelvin

> With my comment above, you wouldn't need this, but I'm gonna comment anyway.
> 
> We don't use special GCCs to target specific CPU variants. Hence inside the
> kernel, you should check the config symbols, to see if support for 68000 or
> 68010 (which isn't supported by the kernel yet) is enabled.

Do you remember some earlier discussion about the m68k Makefile and old
GCC versions?  In particular, lines like:

cpuflags-$(CONFIG_M525x):= $(call cc-option,-mcpu=5253,-m5200)
cpuflags-$(CONFIG_M5249):= $(call cc-option,-mcpu=5249,-m5200)
cpuflags-$(CONFIG_M520x):= $(call cc-option,-mcpu=5208,-m5200)
cpuflags-$(CONFIG_M5206e)   := $(call cc-option,-mcpu=5206e,-m5200)
cpuflags-$(CONFIG_M5206):= $(call cc-option,-mcpu=5206,-m5200)

The problem is that whether MULU.L exists depends on the targeted
architecture, and *that* depends on this Makefile trickery, not
just CONFIG symbols...

Oh, f*** me.


I misremembered.  That problem exists, but only for DIVU.L.  As I said in
the comments (which I wrote *after* deciding I needed this approach), all
ColdFire have MULU.L.  It's DIVU.L that's missing from some early ones.

You're absolutely right.  MULU.L support can be done perfectly from
CONFIG_ options.


Improved patch coming in a few minutes.  My sincere apologies.

[PATCH v3 2/6] powerpc: pseries/Kconfig: Add qspinlock build config

2016-05-25 Thread Pan Xinhui

pseries will use qspinlock by default.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/platforms/pseries/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index bec90fb..f669323 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -21,6 +21,7 @@ config PPC_PSERIES
select HOTPLUG_CPU if SMP
select ARCH_RANDOM
select PPC_DOORBELL
+   select ARCH_USE_QUEUED_SPINLOCKS
default y
 
 config PPC_SPLPAR
-- 
1.9.1

[PATCH v3 5/6] pv-qspinlock: use cmpxchg_release in __pv_queued_spin_unlock

2016-05-25 Thread Pan Xinhui

cmpxchg_release() is lighter-weight than cmpxchg(), so we can gain
better performance by using it here. On some architectures, such as
ppc, the full barrier hurts performance too much.

Suggested-by:  Boqun Feng 
Signed-off-by: Pan Xinhui 
---
 kernel/locking/qspinlock_paravirt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h 
b/kernel/locking/qspinlock_paravirt.h
index a5b1248..2bbffe4 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -614,7 +614,7 @@ __visible void __pv_queued_spin_unlock(struct qspinlock 
*lock)
 * unhash. Otherwise it would be possible to have multiple @lock
 * entries, which would be BAD.
 */
-   locked = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
+   locked = cmpxchg_release(&l->locked, _Q_LOCKED_VAL, 0);
if (likely(locked == _Q_LOCKED_VAL))
return;
 
-- 
1.9.1
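
As a rough user-space analogy for the change above, the same swap can be written with C11 atomics. This is a sketch under assumptions: `LOCKED_VAL` stands in for _Q_LOCKED_VAL, both function names are made up for illustration, and the C11 memory orders only approximate the kernel's cmpxchg() and cmpxchg_release() semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LOCKED_VAL 1u	/* stand-in for _Q_LOCKED_VAL */

/*
 * Swap LOCKED_VAL -> 0 with full (sequentially consistent) ordering.
 * Returns the value seen in *locked before the attempt.
 */
static inline uint8_t unlock_full_barrier(_Atomic uint8_t *locked)
{
	uint8_t expected = LOCKED_VAL;

	assert(locked);
	atomic_compare_exchange_strong_explicit(locked, &expected, 0,
			memory_order_seq_cst, memory_order_seq_cst);
	return expected;	/* updated to the current value on failure */
}

/*
 * Same swap, but only release ordering on success: earlier accesses
 * cannot move past the unlock, while later ones are not held back.
 */
static inline uint8_t unlock_release_only(_Atomic uint8_t *locked)
{
	uint8_t expected = LOCKED_VAL;

	assert(locked);
	atomic_compare_exchange_strong_explicit(locked, &expected, 0,
			memory_order_release, memory_order_relaxed);
	return expected;
}
```

Both variants return the old value, so a caller can keep the `locked == _Q_LOCKED_VAL` fast-path test unchanged.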

[PATCH v3 0/6] powerpc use pv-qspinlock as the default spinlock implementation

2016-05-25 Thread Pan Xinhui

Changes from v2:
__spin_yield_cpu() will yield slices to the lpar if the target cpu is running.
Remove unnecessary rmb() in __spin_yield/wake_cpu.
__pv_wait() will check that *ptr == val.
Some commit message changes.

Changes from v1:
Separate one patch into 6 patches.
Some minor code changes.

I ran several tests on a pseries IBM,8408-E8E with 32 cpus and 64GB memory.
Benchmark results are below.

2 perf tests:
perf bench futex hash
perf bench futex lock-pi

| test             | spinlock     | pv-qspinlock |
| futex hash       | 556370 ops   | 629634 ops   |
| futex lock-pi    | 362 ops      | 367 ops      |

scheduler test:
Test how many loops of schedule() can finish within 10 seconds on all cpus.

| test             | spinlock     | pv-qspinlock |
| schedule() loops | 322811921    | 311449290    |

kernel compiling test:
Build a linux kernel image to see how long it takes.

| test             | spinlock     | pv-qspinlock |
| compiling takes  | 22m          | 22m          |
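
For reading the tables, the relative change of each pv-qspinlock figure against the spinlock baseline can be computed as in this small sketch. The helper name `relative_change_pct()` is made up for illustration. With the figures above it gives roughly +13.2% for futex hash, +1.4% for futex lock-pi and -3.5% for the schedule() loops.

```c
#include <assert.h>

/*
 * Relative change of a candidate result against a baseline, in percent.
 * Positive means the candidate figure is larger than the baseline.
 */
static double relative_change_pct(double baseline, double candidate)
{
	assert(baseline != 0.0);
	return (candidate - baseline) / baseline * 100.0;
}
```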

Pan Xinhui (6):
  qspinlock: powerpc support qspinlock
  powerpc: pseries/Kconfig: Add qspinlock build config
  powerpc: lib/locks.c: Add cpu yield/wake helper function
  pv-qspinlock: powerpc support pv-qspinlock
  pv-qspinlock: use cmpxchg_release in __pv_queued_spin_unlock
  powerpc: pseries: Add pv-qspinlock build config/make

 arch/powerpc/include/asm/qspinlock.h   | 39 +++
 arch/powerpc/include/asm/qspinlock_paravirt.h  | 38 ++
 .../powerpc/include/asm/qspinlock_paravirt_types.h | 13 +++
 arch/powerpc/include/asm/spinlock.h| 31 +--
 arch/powerpc/include/asm/spinlock_types.h  |  4 ++
 arch/powerpc/kernel/Makefile   |  1 +
 arch/powerpc/kernel/paravirt.c | 45 ++
 arch/powerpc/lib/locks.c   | 37 ++
 arch/powerpc/platforms/pseries/Kconfig |  9 +
 arch/powerpc/platforms/pseries/setup.c |  5 +++
 kernel/locking/qspinlock_paravirt.h|  2 +-
 11 files changed, 211 insertions(+), 13 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h
 create mode 100644 arch/powerpc/kernel/paravirt.c

-- 
1.9.1

[PATCH v3 6/6] powerpc: pseries: Add pv-qspinlock build config/make

2016-05-25 Thread Pan Xinhui

pseries has PowerVM support, so the default option is Y.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/kernel/Makefile   | 1 +
 arch/powerpc/platforms/pseries/Kconfig | 8 
 2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2da380f..ae7c2f1 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -50,6 +50,7 @@ obj-$(CONFIG_PPC_970_NAP) += idle_power4.o
 obj-$(CONFIG_PPC_P7_NAP)   += idle_power7.o
 procfs-y   := proc_powerpc.o
 obj-$(CONFIG_PROC_FS)  += $(procfs-y)
+obj-$(CONFIG_PARAVIRT_SPINLOCKS)   += paravirt.o
 rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI)  := rtas_pci.o
 obj-$(CONFIG_PPC_RTAS) += rtas.o rtas-rtc.o $(rtaspci-y-y)
 obj-$(CONFIG_PPC_RTAS_DAEMON)  += rtasd.o
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index f669323..46632e4 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -128,3 +128,11 @@ config HV_PERF_CTRS
  systems. 24x7 is available on Power 8 systems.
 
   If unsure, select Y.
+
+config PARAVIRT_SPINLOCKS
+   bool "Paravirtualization support for qspinlock"
+   depends on PPC_SPLPAR && QUEUED_SPINLOCKS
+   default y
+   help
+ If the platform supports virtualization, for example PowerVM, this
+ option can give the guest better performance.
-- 
1.9.1

[PATCH v3 3/6] powerpc: lib/locks.c: Add cpu yield/wake helper function

2016-05-25 Thread Pan Xinhui

The pv-qspinlock core has pv_wait/pv_kick, which give better
performance by yielding and kicking cpus in some cases.
Let's support them by adding two corresponding helper functions.

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/spinlock.h |  4 
 arch/powerpc/lib/locks.c| 33 +
 2 files changed, 37 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h 
b/arch/powerpc/include/asm/spinlock.h
index 4359ee6..3b65372 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -56,9 +56,13 @@
 /* We only yield to the hypervisor if we are in shared processor mode */
 #define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
 extern void __spin_yield(arch_spinlock_t *lock);
+extern void __spin_yield_cpu(int cpu);
+extern void __spin_wake_cpu(int cpu);
 extern void __rw_yield(arch_rwlock_t *lock);
 #else /* SPLPAR */
 #define __spin_yield(x)barrier()
+#define __spin_yield_cpu(x) barrier()
+#define __spin_wake_cpu(x) barrier()
 #define __rw_yield(x)  barrier()
 #define SHARED_PROCESSOR   0
 #endif
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index a9ebd71..3a58834 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -23,6 +23,39 @@
 #include 
 #include 
 
+void __spin_yield_cpu(int cpu)
+{
+   unsigned int holder_cpu = cpu, yield_count;
+
+   if (cpu == -1) {
+   plpar_hcall_norets(H_CONFER, -1, 0);
+   return;
+   }
+   BUG_ON(holder_cpu >= nr_cpu_ids);
+   yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
+   if ((yield_count & 1) == 0) {
+   /* if target cpu is running, confer slices to lpar*/
+   plpar_hcall_norets(H_CONFER, -1, 0);
+   return;
+   }
+   plpar_hcall_norets(H_CONFER,
+   get_hard_smp_processor_id(holder_cpu), yield_count);
+}
+EXPORT_SYMBOL_GPL(__spin_yield_cpu);
+
+void __spin_wake_cpu(int cpu)
+{
+   unsigned int holder_cpu = cpu, yield_count;
+
+   BUG_ON(holder_cpu >= nr_cpu_ids);
+   yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
+   if ((yield_count & 1) == 0)
+   return; /* virtual cpu is currently running */
+   plpar_hcall_norets(H_PROD,
+   get_hard_smp_processor_id(holder_cpu));
+}
+EXPORT_SYMBOL_GPL(__spin_wake_cpu);
+
 #ifndef CONFIG_QUEUED_SPINLOCKS
 void __spin_yield(arch_spinlock_t *lock)
 {
-- 
1.9.1
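
The decisions made by the two helpers above can be summarised in a small stand-alone sketch. It models only the branch structure, with no hypervisor calls. The enum and both function names are hypothetical. The even/odd reading of yield_count follows the comments in the patch: an even count means the virtual cpu is currently running.

```c
#include <assert.h>
#include <stdint.h>

enum yield_action {
	CONFER_TO_LPAR,	/* give our slices to the partition as a whole */
	CONFER_TO_CPU,	/* give them to one specific virtual cpu */
};

/* Mirrors the branches of __spin_yield_cpu(), nothing more. */
static enum yield_action pick_yield_action(int cpu, uint32_t yield_count,
					   unsigned int nr_cpus)
{
	if (cpu == -1)			/* no particular holder known */
		return CONFER_TO_LPAR;
	assert((unsigned int)cpu < nr_cpus);
	if ((yield_count & 1) == 0)	/* even: target cpu is running */
		return CONFER_TO_LPAR;
	return CONFER_TO_CPU;		/* odd: target cpu is not running */
}

/* Mirrors __spin_wake_cpu(): only prod a cpu that is not running. */
static int should_prod(uint32_t yield_count)
{
	return (yield_count & 1) != 0;
}
```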

[PATCH v3 4/6] pv-qspinlock: powerpc support pv-qspinlock

2016-05-25 Thread Pan Xinhui

As we need to let a pv-qspinlock kernel run in all environments,
including those that have no PowerVM, we should choose which qspinlock
version to use at runtime.

By default pv-qspinlock uses the native version. pv_lock initialization
should be done at boot stage with irqs disabled. If there is a PHYP,
point the pv_lock_ops callbacks at the pv version.
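
The runtime selection described above boils down to a table of function pointers that starts out native and is switched once during early init. A minimal stand-alone sketch follows. All names in it (`struct lock_ops`, `lock_op`, `lock_ops_init()` and the rest) are hypothetical stand-ins, not the identifiers used in the patch, and the callbacks only report which implementation ran.

```c
#include <assert.h>
#include <stdbool.h>

enum impl { IMPL_NATIVE, IMPL_PV };

struct lock_ops {
	enum impl (*lock)(void);
	enum impl (*unlock)(void);
};

static enum impl native_lock(void)   { return IMPL_NATIVE; }
static enum impl native_unlock(void) { return IMPL_NATIVE; }
static enum impl pv_lock(void)       { return IMPL_PV; }
static enum impl pv_unlock(void)     { return IMPL_PV; }

/* Defaults to the native callbacks, so bare metal needs no setup. */
static struct lock_ops lock_op = {
	.lock = native_lock,
	.unlock = native_unlock,
};

/* Called once at boot; switches to pv callbacks under a hypervisor. */
static void lock_ops_init(bool shared_processor)
{
	if (shared_processor) {
		lock_op.lock = pv_lock;
		lock_op.unlock = pv_unlock;
	}
}

/* Callers always go through the table, whatever it points at. */
static enum impl do_lock(void)
{
	assert(lock_op.lock);
	return lock_op.lock();
}

static enum impl do_unlock(void)
{
	assert(lock_op.unlock);
	return lock_op.unlock();
}
```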

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/qspinlock.h   | 17 
 arch/powerpc/include/asm/qspinlock_paravirt.h  | 38 ++
 .../powerpc/include/asm/qspinlock_paravirt_types.h | 13 +++
 arch/powerpc/kernel/paravirt.c | 45 ++
 arch/powerpc/platforms/pseries/setup.c |  5 +++
 5 files changed, 118 insertions(+)
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h
 create mode 100644 arch/powerpc/kernel/paravirt.c

diff --git a/arch/powerpc/include/asm/qspinlock.h 
b/arch/powerpc/include/asm/qspinlock.h
index 5883954..4728f6e 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -12,10 +12,27 @@ static inline void native_queued_spin_unlock(struct 
qspinlock *lock)
smp_store_release((u8 *)lock, 0);
 }
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+
+#define __ARCH_NEED_PV_HASH_LOOKUP
+
+#include 
+
+static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
+{
+   pv_queued_spin_lock(lock, val);
+}
+
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+   pv_queued_spin_unlock(lock);
+}
+#else
 static inline void queued_spin_unlock(struct qspinlock *lock)
 {
native_queued_spin_unlock(lock);
 }
+#endif
 
 #include 
 
diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h 
b/arch/powerpc/include/asm/qspinlock_paravirt.h
new file mode 100644
index 000..cd17a79
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock_paravirt.h
@@ -0,0 +1,38 @@
+#ifndef CONFIG_PARAVIRT_SPINLOCKS
+#error "do not include this file"
+#endif
+
+#ifndef _ASM_QSPINLOCK_PARAVIRT_H
+#define _ASM_QSPINLOCK_PARAVIRT_H
+
+#include  
+
+extern void pv_lock_init(void);
+extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+extern void __pv_init_lock_hash(void);
+extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+extern void __pv_queued_spin_unlock(struct qspinlock *lock);
+
+static inline void pv_queued_spin_lock(struct qspinlock *lock, u32 val)
+{
+   CLEAR_IO_SYNC;
+   pv_lock_op.lock(lock, val);
+}
+
+static inline void pv_queued_spin_unlock(struct qspinlock *lock)
+{
+   SYNC_IO;
+   pv_lock_op.unlock(lock);
+}
+
+static inline void pv_wait(u8 *ptr, u8 val, int lockcpu)
+{
+   pv_lock_op.wait(ptr, val, lockcpu);
+}
+
+static inline void pv_kick(int cpu)
+{
+   pv_lock_op.kick(cpu);
+}
+
+#endif
diff --git a/arch/powerpc/include/asm/qspinlock_paravirt_types.h 
b/arch/powerpc/include/asm/qspinlock_paravirt_types.h
new file mode 100644
index 000..e1fdeb0
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock_paravirt_types.h
@@ -0,0 +1,13 @@
+#ifndef _ASM_QSPINLOCK_PARAVIRT_TYPES_H
+#define _ASM_QSPINLOCK_PARAVIRT_TYPES_H
+
+struct pv_lock_ops {
+   void (*lock)(struct qspinlock *lock, u32 val);
+   void (*unlock)(struct qspinlock *lock);
+   void (*wait)(u8 *ptr, u8 val, int cpu);
+   void (*kick)(int cpu);
+};
+
+extern struct pv_lock_ops pv_lock_op;
+
+#endif
diff --git a/arch/powerpc/kernel/paravirt.c b/arch/powerpc/kernel/paravirt.c
new file mode 100644
index 000..4f19f7e
--- /dev/null
+++ b/arch/powerpc/kernel/paravirt.c
@@ -0,0 +1,45 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+
+static void __native_queued_spin_unlock(struct qspinlock *lock)
+{
+   native_queued_spin_unlock(lock);
+}
+
+static void __pv_wait(u8 *ptr, u8 val, int cpu)
+{
+   HMT_low();
+   if (READ_ONCE(*ptr) == val)
+   __spin_yield_cpu(cpu);
+   HMT_medium();
+}
+
+static void __pv_kick(int cpu)
+{
+   __spin_wake_cpu(cpu);
+}
+
+struct pv_lock_ops pv_lock_op = {
+   .lock = native_queued_spin_lock_slowpath,
+   .unlock = __native_queued_spin_unlock,
+   .wait = NULL,
+   .kick = NULL,
+};
+EXPORT_SYMBOL(pv_lock_op);
+
+void __init pv_lock_init(void)
+{
+   if (SHARED_PROCESSOR) {
+   __pv_init_lock_hash();
+   pv_lock_op.lock = __pv_queued_spin_lock_slowpath;
+   pv_lock_op.unlock = __pv_queued_spin_unlock;
+   pv_lock_op.wait = __pv_wait;
+   pv_lock_op.kick = __pv_kick;
+   }
+}
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 6e944fc..c9f056e 100644
--- a/arch/powerpc

[PATCH v3 1/6] qspinlock: powerpc support qspinlock

2016-05-25 Thread Pan Xinhui

Base code to enable qspinlock on powerpc. This patch adds some #ifdefs
here and there. Although there is no paravirt-related code yet, we can
successfully build a qspinlock kernel after applying this patch.
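
The unlock added by this patch is a release store to the lock byte. As a user-space analogy only, the same idea with C11 atomics looks like the sketch below. `struct byte_lock` and both functions are hypothetical. This is a plain test-and-set byte lock, not the queued algorithm, and it is meant to show only the acquire/release pairing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct byte_lock {
	_Atomic uint8_t locked;
};

/* Take the lock if it is free; acquire ordering on success. */
static int byte_trylock(struct byte_lock *l)
{
	uint8_t expected = 0;

	assert(l);
	return atomic_compare_exchange_strong_explicit(&l->locked,
			&expected, 1,
			memory_order_acquire, memory_order_relaxed);
}

/*
 * Release store: no load or store from the critical section may be
 * reordered past this point.
 */
static void byte_unlock(struct byte_lock *l)
{
	assert(l);
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```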

Signed-off-by: Pan Xinhui 
---
 arch/powerpc/include/asm/qspinlock.h  | 22 ++
 arch/powerpc/include/asm/spinlock.h   | 27 +++
 arch/powerpc/include/asm/spinlock_types.h |  4 
 arch/powerpc/lib/locks.c  |  4 
 4 files changed, 45 insertions(+), 12 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock.h

diff --git a/arch/powerpc/include/asm/qspinlock.h 
b/arch/powerpc/include/asm/qspinlock.h
new file mode 100644
index 000..5883954
--- /dev/null
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -0,0 +1,22 @@
+#ifndef _ASM_POWERPC_QSPINLOCK_H
+#define _ASM_POWERPC_QSPINLOCK_H
+
+#include 
+
+#define SPIN_THRESHOLD (1 << 15)
+#define queued_spin_unlock queued_spin_unlock
+
+static inline void native_queued_spin_unlock(struct qspinlock *lock)
+{
+   /* no load/store can be across the unlock()*/
+   smp_store_release((u8 *)lock, 0);
+}
+
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+   native_queued_spin_unlock(lock);
+}
+
+#include 
+
+#endif /* _ASM_POWERPC_QSPINLOCK_H */
diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 523673d..4359ee6 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,20 @@
 #define SYNC_IO
 #endif
 
+#if defined(CONFIG_PPC_SPLPAR)
+/* We only yield to the hypervisor if we are in shared processor mode */
+#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
+extern void __spin_yield(arch_spinlock_t *lock);
+extern void __rw_yield(arch_rwlock_t *lock);
+#else /* SPLPAR */
+#define __spin_yield(x)barrier()
+#define __rw_yield(x)  barrier()
+#define SHARED_PROCESSOR   0
+#endif
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include 
+#else
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
return lock.slock == 0;
@@ -106,18 +120,6 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock)
  * held.  Conveniently, we have a word in the paca that holds this
  * value.
  */
-
-#if defined(CONFIG_PPC_SPLPAR)
-/* We only yield to the hypervisor if we are in shared processor mode */
-#define SHARED_PROCESSOR (lppaca_shared_proc(local_paca->lppaca_ptr))
-extern void __spin_yield(arch_spinlock_t *lock);
-extern void __rw_yield(arch_rwlock_t *lock);
-#else /* SPLPAR */
-#define __spin_yield(x)barrier()
-#define __rw_yield(x)  barrier()
-#define SHARED_PROCESSOR   0
-#endif
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
CLEAR_IO_SYNC;
@@ -169,6 +171,7 @@ extern void arch_spin_unlock_wait(arch_spinlock_t *lock);
do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
 #endif
 
+#endif /* !CONFIG_QUEUED_SPINLOCKS */
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
diff --git a/arch/powerpc/include/asm/spinlock_types.h b/arch/powerpc/include/asm/spinlock_types.h
index 2351adc..bd7144e 100644
--- a/arch/powerpc/include/asm/spinlock_types.h
+++ b/arch/powerpc/include/asm/spinlock_types.h
@@ -5,11 +5,15 @@
 # error "please don't include this file directly"
 #endif
 
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include 
+#else
 typedef struct {
volatile unsigned int slock;
 } arch_spinlock_t;
 
 #define __ARCH_SPIN_LOCK_UNLOCKED  { 0 }
+#endif
 
 typedef struct {
volatile signed int lock;
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index f7deebd..a9ebd71 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 
+#ifndef CONFIG_QUEUED_SPINLOCKS
 void __spin_yield(arch_spinlock_t *lock)
 {
unsigned int lock_value, holder_cpu, yield_count;
@@ -42,6 +43,7 @@ void __spin_yield(arch_spinlock_t *lock)
get_hard_smp_processor_id(holder_cpu), yield_count);
 }
 EXPORT_SYMBOL_GPL(__spin_yield);
+#endif
 
 /*
  * Waiting for a read lock or a write lock on a rwlock...
@@ -69,6 +71,7 @@ void __rw_yield(arch_rwlock_t *rw)
 }
 #endif
 
+#ifndef CONFIG_QUEUED_SPINLOCKS
 void arch_spin_unlock_wait(arch_spinlock_t *lock)
 {
smp_mb();
@@ -84,3 +87,4 @@ void arch_spin_unlock_wait(arch_spinlock_t *lock)
 }
 
 EXPORT_SYMBOL(arch_spin_unlock_wait);
+#endif
-- 
1.9.1

Re: [PATCH 08v2/10] m68k: Add

2016-05-25 Thread George Spelvin

This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive. :-)

Addition chains found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html

Signed-off-by: George Spelvin 
Cc: Geert Uytterhoeven 
Cc: Greg Ungerer 
Cc: linux-m...@lists.linux-m68k.org
---
 arch/m68k/Kconfig.cpu|  1 +
 arch/m68k/include/asm/archhash.h | 58 
 2 files changed, 59 insertions(+)
 create mode 100644 arch/m68k/include/asm/archhash.h

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 0dfcf128..bf3de464 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -40,6 +40,7 @@ config M68000
select CPU_HAS_NO_MULDIV64
select CPU_HAS_NO_UNALIGNED
select GENERIC_CSUM
+   select HAVE_ARCH_HASH
help
  The Freescale (was Motorola) 68000 CPU is the first generation of
  the well known M68K family of processors. The CPU core as well as
diff --git a/arch/m68k/include/asm/archhash.h b/arch/m68k/include/asm/archhash.h
new file mode 100644
index ..2532cf92
--- /dev/null
+++ b/arch/m68k/include/asm/archhash.h
@@ -0,0 +1,58 @@
+#ifndef _ASM_ARCHHASH_H
+#define _ASM_ARCHHASH_H
+
+/*
+ * The only 68k processors that lack MULU.L and so need this workaround
+ * are the original 68000 and 68010.
+ */
+#if defined(CONFIG_M68000) || defined(CONFIG_M68010)
+
+#define HAVE_ARCH__HASH_32 1
+/*
+ * While it would be legal to substitute a different hash operation
+ * entirely, let's keep it simple and just use an optimized multiply
+ * by GOLDEN_RATIO_32 = 0x61C88647.
+ *
+ * The best way to do that appears to be to multiply by 0x8647 with
+ * shifts and adds, and use mulu.w to multiply the high half by 0x61C8.
+ *
+ * Because the 68000 has multi-cycle shifts, this addition chain is
+ * chosen to minimise the shift distances.
+ *
+ * Despite every attempt to spoon-feed GCC simple operations, GCC 6.1.1
+ * doggedly insists on doing annoying things like converting "lsl.l #2,"
+ * (12 cycles) to two adds (8+8 cycles).
+ *
+ * It also likes to notice two shifts in a row, like "a = x << 2" and
+ * "a <<= 7", and convert that to "a = x << 9".  But shifts longer than
+ * 8 bits are extra-slow on m68k, so that's a lose.
+ *
+ * Since the 68000 is a very simple in-order processor with no instruction
+ * scheduling effects on execution time, we can safely take it out of GCC's
+ * hands and write one big asm() block.
+ *
+ * Without calling overhead, this operation is 30 bytes (14 instructions
+ * plus one immediate constant) and 166 cycles.
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+   u32 a, b;
+
+   asm(   "move.l %2,%0"   /* 0x0001 */
+   "\n lsl.l #2,%0"/* 0x0004 */
+   "\n move.l %0,%1"
+   "\n lsl.l #7,%0"/* 0x0200 */
+   "\n add.l %2,%0"/* 0x0201 */
+   "\n add.l %0,%1"/* 0x0205 */
+   "\n add.l %0,%0"/* 0x0402 */
+   "\n add.l %0,%1"/* 0x0607 */
+   "\n lsl.l #5,%0"/* 0x8040 */
+   /* 0x8647 */
+   : "=&d" (a), "=&r" (b)
+   : "g" (x));
+
+   return ((u16)(x*0x61c8) << 16) + a + b;
+}
+#endif /* HAVE_ARCH__HASH_32 */
+
+#endif /* _ASM_ARCHHASH_H */
-- 
2.8.1
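
The addition chain in the asm() block can be checked on any host with a
portable C model. The sketch below mirrors the chain step by step and compares
it with a plain 32-bit multiply; the demo_ names are made up for illustration
and the code is not meant to replace the m68k asm, whose whole point is
instruction selection.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_GOLDEN_RATIO_32 0x61C88647u

/* Portable model of the shift-and-add chain from the asm() block. */
static uint32_t demo_hash_32(uint32_t x)
{
	uint32_t a, b;

	a = x << 2;	/* a = x * 0x0004 */
	b = a;		/* b = x * 0x0004 */
	a <<= 7;	/* a = x * 0x0200 */
	a += x;		/* a = x * 0x0201 */
	b += a;		/* b = x * 0x0205 */
	a += a;		/* a = x * 0x0402 */
	b += a;		/* b = x * 0x0607 */
	a <<= 5;	/* a = x * 0x8040 */

	/* a + b == x * 0x8647; the 16-bit product supplies the 0x61C8 half. */
	return ((uint32_t)(uint16_t)(x * 0x61c8u) << 16) + a + b;
}

/* Cross-check one input against a plain 32-bit multiply. */
static void demo_hash_32_check(uint32_t x)
{
	assert(demo_hash_32(x) == x * DEMO_GOLDEN_RATIO_32);
}
```

Because every step is linear modulo 2^32, the two results agree for every
input, not just the ones that happen to be tested.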

Re: [git pull] drm for v4.7

2016-05-25 Thread Jani Nikula

On Wed, 25 May 2016, Stephen Rothwell  wrote:
> My bad.  That warning turned up in linux-next last Wednesday but I
> didn't notice (I have other stuff to do and don't carefully watch all
> the builds all day - and there are quite a few warnings to filter new
> ones out out of).  Maybe I need some automated way to flag new warnings.

There may be better ones out there, but Artem's "aiaiai" has some
helpers [1] for diffing build logs, if you want something simple to
integrate into existing scripts.

BR,
Jani.


[1] http://git.infradead.org/users/dedekind/aiaiai.git/tree/HEAD:/helpers


-- 
Jani Nikula, Intel Open Source Technology Center

Re: ARM: dts: exynos: Add MFC memory banks for Peach boards

2016-05-25 Thread pankaj.dubey

Hi Javier,

On Friday 29 April 2016 12:51 AM, Javier Martinez Canillas wrote:
> The MFC nodes with the memory regions reserved for memory allocations
> are missing in the Exynos5420 Peach Pit and Exynos5800 Peach Pi DTS.
> 
> This causes the s5p-mfc driver probe to fail with the following error:
> 
> [4.140647] s5p_mfc_alloc_memdevs:1072: Failed to declare coherent memory for MFC device
> [4.216163] s5p-mfc: probe of 1100.codec failed with error -12
> 
> Add the missing nodes so the driver probes and the {en,de}coder video
> nodes are registered correctly:
> 
> [4.096277] s5p-mfc 1100.codec: decoder registered as /dev/video4
> [4.102282] s5p-mfc 1100.codec: encoder registered as /dev/video5
> 
> Signed-off-by: Javier Martinez Canillas 

Just noticed that the current krzk/for-next fails to boot on an Exynos5800
based Chromebook device. Git bisect points to this patch as the culprit.
When I revert this patch, the device boots normally.
Are there any missing patches that we need to take on krzk/for-next to
boot on the Chromebook?

Thanks,
Pankaj Dubey

Re: [PATCH 2/3] clk: samsung: cpu: prepare for adding Exynos5433 CPU clocks

2016-05-25 Thread Krzysztof Kozlowski

On 05/24/2016 03:19 PM, Bartlomiej Zolnierkiewicz wrote:
> Exynos5433 uses different register layout for CPU clock registers
> than earlier SoCs so add new code for handling this layout.  Also
> add new CLK_CPU_HAS_E5433_REGS_LAYOUT flag to request using it.
> 
> There should be no functional change resulting from this patch.
> 
> Cc: Kukjin Kim 
> CC: Krzysztof Kozlowski 
> Signed-off-by: Bartlomiej Zolnierkiewicz 
> ---
>  drivers/clk/samsung/clk-cpu.c | 131 
> +-
>  drivers/clk/samsung/clk-cpu.h |   4 +-
>  2 files changed, 133 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/clk/samsung/clk-cpu.c b/drivers/clk/samsung/clk-cpu.c
> index 813003d..8bf7e80 100644
> --- a/drivers/clk/samsung/clk-cpu.c
> +++ b/drivers/clk/samsung/clk-cpu.c
> @@ -45,6 +45,13 @@
>  #define E4210_DIV_STAT_CPU0  0x400
>  #define E4210_DIV_STAT_CPU1  0x404
>  
> +#define E5433_MUX_SEL2   0x008
> +#define E5433_MUX_STAT2  0x208
> +#define E5433_DIV_CPU0   0x400
> +#define E5433_DIV_CPU1   0x404
> +#define E5433_DIV_STAT_CPU0  0x500
> +#define E5433_DIV_STAT_CPU1  0x504

I have a problem matching it with the datasheet. The base is 0x200?

> +
>  #define E4210_DIV0_RATIO0_MASK   0x7
>  #define E4210_DIV1_HPM_MASK  (0x7 << 4)
>  #define E4210_DIV1_COPY_MASK (0x7 << 0)
> @@ -253,6 +260,102 @@ static int exynos_cpuclk_post_rate_change(struct clk_notifier_data *ndata,
>  }
>  
>  /*
> + * Helper function to set the 'safe' dividers for the CPU clock. The parameters
> + * div and mask contain the divider value and the register bit mask of the
> + * dividers to be programmed.
> + */
> +static void exynos5433_set_safe_div(void __iomem *base, unsigned long div,
> + unsigned long mask)

Please align the argument.

> +{
> + unsigned long div0;
> +
> + div0 = readl(base + E5433_DIV_CPU0);
> + div0 = (div0 & ~mask) | (div & mask);
> + writel(div0, base + E5433_DIV_CPU0);
> + wait_until_divider_stable(base + E5433_DIV_STAT_CPU0, mask);
> +}
> +
> +/* handler for pre-rate change notification from parent clock */
> +static int exynos5433_cpuclk_pre_rate_change(struct clk_notifier_data *ndata,
> + struct exynos_cpuclk *cpuclk, void __iomem *base)
> +{
> + const struct exynos_cpuclk_cfg_data *cfg_data = cpuclk->cfg;
> + unsigned long alt_prate = clk_get_rate(cpuclk->alt_parent);
> + unsigned long alt_div = 0, alt_div_mask = DIV_MASK;
> + unsigned long div0, div1 = 0, mux_reg;
> + unsigned long flags;
> +
> + /* find out the divider values to use for clock data */
> + while ((cfg_data->prate * 1000) != ndata->new_rate) {
> + if (cfg_data->prate == 0)
> + return -EINVAL;
> + cfg_data++;
> + }
> +
> + spin_lock_irqsave(cpuclk->lock, flags);
> +
> + /*
> +  * For the selected PLL clock frequency, get the pre-defined divider
> +  * values.
> +  */
> + div0 = cfg_data->div0;
> + div1 = cfg_data->div1;
> +
> + /*
> +  * If the old parent clock speed is less than the clock speed of
> +  * the alternate parent, then it should be ensured that at no point
> +  * the armclk speed is more than the old_prate until the dividers are
> +  * set.  Also workaround the issue of the dividers being set to lower
> +  * values before the parent clock speed is set to new lower speed
> +  * (this can result in too high speed of armclk output clocks).

In entire sentence: s/speed/rate/

> +  */
> + if (alt_prate > ndata->old_rate || ndata->old_rate > ndata->new_rate) {
> + unsigned long tmp_rate = min(ndata->old_rate, ndata->new_rate);
> +
> + alt_div = DIV_ROUND_UP(alt_prate, tmp_rate) - 1;
> + WARN_ON(alt_div >= MAX_DIV);
> +
> + exynos5433_set_safe_div(base, alt_div, alt_div_mask);
> + div0 |= alt_div;
> + }
> +
> + /* select the alternate parent */
> + mux_reg = readl(base + E5433_MUX_SEL2);
> + writel(mux_reg | 1, base + E5433_MUX_SEL2);
> + wait_until_mux_stable(base + E5433_MUX_STAT2, 0, 2);
> +
> + /* alternate parent is active now. set the dividers */
> + writel(div0, base + E5433_DIV_CPU0);
> + wait_until_divider_stable(base + E5433_DIV_STAT_CPU0, DIV_MASK_ALL);
> +
> + writel(div1, base + E5433_DIV_CPU1);
> + wait_until_divider_stable(base + E5433_DIV_STAT_CPU1, DIV_MASK_ALL);
> +
> + spin_unlock_irqrestore(cpuclk->lock, flags);

One blank line please.

> + return 0;
> +}
> +
> +/* handler for post-rate change notification from parent clock */
> +static int exynos5433_cpuclk_post_rate_change(struct clk_notifier_data *ndata,
> + struct exynos_cpuclk *cpuclk, void __iomem *base)
> +{
> + unsigned long div = 0, div_mask = DIV_MASK;
> + unsigned long mux_reg;
> + unsigned long flags;
> +
> + spin_lock_irqsave(cpuclk
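
The 'safe divider' arithmetic in the quoted pre-rate-change handler can be
tried out in userspace. The sketch below is an illustration only: the demo_
names are made up, register I/O is left out, and it assumes (as the "- 1" in
the quoted code suggests) that a divider field value of N divides the clock by
N + 1.

```c
#include <assert.h>

#define DEMO_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * Divider field value such that alt_prate / (field + 1) does not exceed
 * target_rate, following the DIV_ROUND_UP(alt_prate, tmp_rate) - 1 step.
 */
static unsigned long demo_safe_div(unsigned long alt_prate,
				   unsigned long target_rate)
{
	assert(alt_prate != 0 && target_rate != 0);
	return DEMO_DIV_ROUND_UP(alt_prate, target_rate) - 1;
}

/* Masked field update as in exynos5433_set_safe_div(), minus readl/writel. */
static unsigned long demo_update_field(unsigned long reg, unsigned long div,
				       unsigned long mask)
{
	return (reg & ~mask) | (div & mask);
}
```

With an 800 MHz alternate parent and a 500 MHz target, the field comes out
as 1, so the CPU would run at 400 MHz while parked on the alternate parent.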

Re: fsl-dcu not works on latest "drm-next"

2016-05-25 Thread Alexander Stein

On Tuesday 24 May 2016 23:20:02, Stefan Agner wrote:
> On 2016-05-24 19:14, Meng Yi wrote:
> > I found that its regmap endianness issue, so I want to replace the
> > "regmap".
> Hm, replace with what? Note that we need some kind of endianness
> convertion since the IP is big endian on LS1021a and little endian on
> Vybrid (vf610).

Yep, regmap is required. It was broken in the meantime but should be fixed now. See
the linked lkml post.

> Is it maybe just an issue with regmap/the big-endian property in the
> device tree? Maybe this thread is interesting for you:
> https://lkml.org/lkml/2016/3/23/233

AFAICT the device tree should not have been changed here. The "big-endian"
property was there from the beginning.

> > I just tested the latest drm-next branch on Freescale/NXP ls1021a-twr,
> > and got some log below. And fsl-dcu not works.
> > 
> > Since "drm-next" merged some branch , use git bisect had some problem ,
> > 
> > so I manually checked out that "fsl-dcu" works at
> > d761701c55a99598477f3cb25c03d939a7711e74
> > 
> > And not works now. some log below:

Which commit actually broke your kernel? And where can it be fetched from? Is
your problem really caused by regmap?

Best regards,
Alexander

Re: [PATCH RFC kernel] balloon: speed up inflating/deflating process

2016-05-25 Thread Michael S. Tsirkin

On Wed, May 25, 2016 at 01:00:27AM +, Li, Liang Z wrote:
> It should be changed if we are going to use a small page bitmap.

Exactly.

Re: can't boot with reiserfs on linux-4.6.0+

2016-05-25 Thread Jeff Chua

On Wed, May 25, 2016 at 12:10 AM, Linus Torvalds
 wrote:
> On Tue, May 24, 2016 at 8:59 AM, Al Viro  wrote:
>>
>> Umm...  Any chance of getting the function names to go with the addresses?
>> I'll try to reproduce it here, but the things would be easier with that
>> information...
>
> Yeah, we shouldn't even allow non-KALLSYMS builds. In fact, unless you
> pick EXPERT (which you shouldn't, unless you're doing some embedded
> development) you can't even disable it.
>
> Jeff, please don't use non-KALLSYMS builds. They are completely undebuggable.
>
>  Linus

Got it. Will compile with CONFIG_KALLSYMS=y :)

Thanks,
Jeff

Re: [RFC PATCH 1/2] Input: rotary-encoder- Add support for absolute encoder

2016-05-25 Thread Vignesh R

Hi Dmitry,

On 05/23/2016 02:48 PM, R, Vignesh wrote:
> 
> 
> On 5/20/2016 10:04 PM, Dmitry Torokhov wrote:
>> On Thu, May 19, 2016 at 02:34:00PM +0530, Vignesh R wrote:
>>> There are rotary-encoders where GPIO lines reflect the actual position
>>> of the rotary encoder dial. For example, if dial points to 9, then four
>>> GPIO lines connected to the rotary encoder will read HLLH(1001b = 9).
>>> Add support for such rotary-encoder.
>>> The driver relies on rotary-encoder,absolute-encoder DT property to
>>> detect such encoders.
>>> Since, GPIO IRQs are not necessary to work with
>>> such encoders, optional polling mode support is added using
>>> input_poll_dev skeleton. This is can be used by enabling
>>> CONFIG_INPUT_GPIO_ROTARY_ENCODER_POLL_MODE_SUPPORT.
>>
>> Does this really belong to a rotary encoder and not a new driver that
>> simply translates gpio-encoded value into ABS* event?
>>
> 
> Currently rotary encoder driver only supports incremental/step counting
> rotary devices. However, the device that is there on am335x-ice is an
> absolute encoder but, IMO, nevertheless a kind of rotary encoder. The
> only difference is that there is no need to count steps and the absolute
> position value is always available as binary encoded state of connected
> GPIOs.
> The hardware on am335x-ice is a mechanical rotary encoder switch
> connected over 4 GPIOs. It is same as binary encoder described at [1]
> (except there are 4 GPIO lines), so this lead me to add support in
> rotary-encoder.
> 
> [1]https://en.wikipedia.org/wiki/Rotary_encoder#Standard_binary_encoding
> 

Could you please comment on how you would like to support the encoder
described above: as a new driver, with the existing driver plus a new
compatible/mode setting via DT, or as suggested by Uwe in another reply?
IMHO, supporting it in the existing driver with a new mode/compatible string
looks like the better option, as the hardware is a kind of rotary encoder.

-- 
Regards
Vignesh
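
To make the "HLLH (1001b = 9)" example concrete, here is a minimal userspace
sketch of the decoding step. It assumes the line levels have already been
sampled into an array with the most significant line first; the function name
and that ordering are illustrative choices, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Turn sampled line levels into an absolute position.
 * levels[0] is the most significant line; nonzero means the line reads high.
 */
static unsigned int demo_decode_position(const int *levels, size_t nlines)
{
	unsigned int pos = 0;
	size_t i;

	assert(nlines <= sizeof(pos) * 8);
	for (i = 0; i < nlines; i++)
		pos = (pos << 1) | (levels[i] ? 1u : 0u);

	return pos;
}
```

No step counting or previous state is involved, which is the property the
absolute encoder offers over the incremental ones.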

Re: [RFC PATCH 1/2] Input: rotary-encoder- Add support for absolute encoder

2016-05-25 Thread Vignesh R

On 05/24/2016 01:50 PM, Uwe Kleine-König wrote:
> Hello,
> 
> On Tue, May 24, 2016 at 10:39:26AM +0530, Vignesh R wrote:
>> On 05/23/2016 06:48 PM, Uwe Kleine-König wrote:
>>> On Mon, May 23, 2016 at 04:48:40PM +0530, R, Vignesh wrote:
>>>> On 5/22/2016 3:56 PM, Uwe Kleine-König wrote:
>>>>> On Thu, May 19, 2016 at 02:34:00PM +0530, Vignesh R wrote:
[...]
>>>
>>> OK, we have code that is more complex than it needs to be for your
>>> device. But your device is a special case of the supported devices, so
>>> I'd say don't bother that there is more logic in the driver than you
>>> need and be lucky.
>>
>> More complexity is just a overhead. Since, encoder can be turned at a
>> rate faster than handling of IRQs (rotary_encoder_quarter_period_irq()
>> is threaded IRQ hence, priority is not close to real time), some states
> 
> This problem isn't unique to your hardware. An "ordinary" encoder with
> just two GPIOs and more than one period can be rotated faster than
> 1/irq_latency, too.

But my hardware should not be affected by this problem. The whole point
of an absolute encoder is to spare software the difficulty of keeping
track of steps: one can read the GPIO state at any time and find out which
value the encoder is pointing to.

> There are two things that can be done:
> 
>  - undo the conversion to threaded irqs; or
>  - read out the gpios in the fast handler and only delay decoding and
>reporting of the event
> 
> Both approaches have their disadvantages.
> 

And neither can guarantee that an event is not missed (on a loaded
system), in which case all of the state logic goes for a toss. With binary
encoding, multiple IRQs will fire at the same time (that's why incremental
encoders are usually Gray coded), and handling all of them is a considerable
overhead, so there is a good chance that some events are missed.

>> can be missed. rotary_encoder_quarter_period_irq() is not robust in this
>> case, reading gpios directly is more suitable option. I see similar
>> views expressed in previously[1]
>>
>> [1]http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/391196.html
> 
> IMHO the right thing to do is to improve
> rotary_encoder_quarter_period_irq (and also the other handlers for full
> and half period mode) to make use of additional GPIOs. 

I doubt there exists any incremental encoder with more than 2
GPIOs (except for the extra reference GPIO or Z signal). So modifying the
full/half period handlers may be unnecessary.

> This way all types of devices benefit and more code is shared.

Sorry, but IMHO there is little code sharing and more complexity. So
I will leave it to the maintainer to decide what's the best approach here.

-- 
Regards
Vignesh
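
The point about several lines changing at once under binary encoding can be
illustrated with a small sketch. It counts how many lines toggle between two
positions, once for plain binary and once for Gray code; the demo_ helpers are
made up for this note and do not correspond to anything in the driver.

```c
#include <assert.h>

/* Standard binary-to-Gray conversion. */
static unsigned int demo_to_gray(unsigned int n)
{
	return n ^ (n >> 1);
}

/* Number of lines whose level differs between two encoded values. */
static unsigned int demo_lines_changed(unsigned int before, unsigned int after)
{
	unsigned int diff = before ^ after;
	unsigned int count = 0;

	while (diff) {
		count += diff & 1u;
		diff >>= 1;
	}
	return count;
}

/* Consecutive Gray codes differ in exactly one line. */
static void demo_check_gray_step(unsigned int n)
{
	assert(demo_lines_changed(demo_to_gray(n), demo_to_gray(n + 1)) == 1);
}
```

Going from 7 to 8 toggles four lines in plain binary (0111 to 1000), so with
one interrupt per line four IRQs fire for a single step, while the Gray-coded
equivalent toggles only one.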

RE: [PATCH RFC kernel] balloon: speed up inflating/deflating process

2016-05-25 Thread Li, Liang Z

> > > Suggestion to address all above comments:
> > >   1. allocate a bunch of pages and link them up,
> > >  calculating the min and the max pfn.
> > >  if max-min exceeds the allocated bitmap size,
> > >  tell host.
> >
> > I am not sure if it works well in some cases, e.g. The allocated pages
> > are across a wide range and the max-min > limit is very frequently to be
> true.
> > Then, there will be many times of virtio transmission and it's bad for
> > performance improvement. Right?
> 
> It's a tradeoff for sure. Measure it, see what the overhead is.
> 

Hi MST,

I have measured the performance when using a 32K page bitmap, inflating the
balloon to 3GB on an idle guest with 4GB RAM.

Now:
total inflating time: 338ms
the count of virtio data transmission:  373
the call count of madvise: 865

Before:
total inflating time: 175ms
the count of virtio data transmission: 1
the call count of madvise: 42

The result may be worse if the guest is not idle, or if the guest has more
RAM.
Do you want more data?

Is it worth doing that?

Liang

> >
> > >   2. limit allocated bitmap size to something reasonable.
> > >  How about 32Kbytes? This is 256kilo bit in the map, which comes
> > >  out to 1Giga bytes of memory in the balloon.
> >
> > So, even the VM has 1TB of RAM, the page bitmap will take 32MB of
> memory.
> > Maybe it's better to use a big page bitmap the save the pages
> > allocated by balloon, and split the big page bitmap to 32K bytes unit, then
> transfer one unit at a time.
> 
> How is this different from what I said?
> 
> >
> > Should we use a page bitmap to replace 'vb->pages' ?
> >
> > How about rolling back to use PFNs if the count of requested pages is a
> small number?
> >
> > Liang
> 
> That's why we have start pfn. you can use that to pass even a single page
> without a lot of overhead.
> 
> > > > --
> > > > 1.9.1
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > > the body of a message to majord...@vger.kernel.org More majordomo
> > > info at http://vger.kernel.org/majordomo-info.html
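
The sizes discussed in this thread follow from simple arithmetic, sketched
below for reference. The sketch assumes 4 KiB pages and one bit per page; the
demo_ names are made up, and demo_min_transfers() gives only a lower bound that
would hold if the ballooned pfns were perfectly contiguous.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* assumes 4 KiB pages */

/* Bytes of guest memory that one bitmap of the given size can describe. */
static uint64_t demo_bitmap_coverage(uint64_t bitmap_bytes)
{
	assert(bitmap_bytes <= (UINT64_MAX >> (DEMO_PAGE_SHIFT + 3)));
	return (bitmap_bytes * 8) << DEMO_PAGE_SHIFT;
}

/* Bitmap size in bytes needed to describe every page of the given RAM size. */
static uint64_t demo_bitmap_size(uint64_t ram_bytes)
{
	return (ram_bytes >> DEMO_PAGE_SHIFT) / 8;
}

/* Lower bound on transfers if every bitmap could be filled completely. */
static uint64_t demo_min_transfers(uint64_t balloon_bytes,
				   uint64_t bitmap_bytes)
{
	uint64_t coverage = demo_bitmap_coverage(bitmap_bytes);

	assert(coverage != 0);
	return (balloon_bytes + coverage - 1) / coverage;
}
```

Under these assumptions a 32 KiB bitmap covers 1 GiB, a 1 TiB guest needs a
32 MiB bitmap, and a 3 GiB inflate needs at least 3 transfers. The 373
transfers reported above would then reflect how scattered the allocated pfns
were rather than the bitmap size alone.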

Re: [PATCH 08v2/10] m68k: Add

2016-05-25 Thread Geert Uytterhoeven

On Wed, May 25, 2016 at 10:24 AM, George Spelvin
 wrote:
> --- a/arch/m68k/Kconfig.cpu
> +++ b/arch/m68k/Kconfig.cpu
> @@ -40,6 +40,7 @@ config M68000
> select CPU_HAS_NO_MULDIV64
> select CPU_HAS_NO_UNALIGNED
> select GENERIC_CSUM
> +   select HAVE_ARCH_HASH
> help
>   The Freescale (was Motorola) 68000 CPU is the first generation of
>   the well known M68K family of processors. The CPU core as well as
> diff --git a/arch/m68k/include/asm/archhash.h b/arch/m68k/include/asm/archhash.h
> new file mode 100644
> index ..2532cf92
> --- /dev/null
> +++ b/arch/m68k/include/asm/archhash.h
> @@ -0,0 +1,58 @@
> +#ifndef _ASM_ARCHHASH_H
> +#define _ASM_ARCHHASH_H
> +
> +/*
> + * The only 68k processors that lack MULU.L and so need this workaround
> + * are the original 68000 and 68010.
> + */
> +#if defined(CONFIG_M68000) || defined(CONFIG_M68010)

As I said before, I don't think you need this check, given HAVE_ARCH_HASH is
selected by M68000, and M68010 doesn't exist.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [PATCH v4] input: tablet: add Pegasus Notetaker tablet driver

2016-05-25 Thread Oliver Neukum

On Wed, 2016-05-25 at 09:44 +0200, Martin Kepplinger wrote:
> This adds a driver for the Pegasus Notetaker Pen. When connected,
> this uses the Pen as an input tablet.
> 
> This device was sold in various different brandings, for example
>   "Pegasus Mobile Notetaker M210",
>   "Genie e-note The Notetaker",
>   "Staedtler Digital ballpoint pen 990 01",
>   "IRISnotes Express" or
>   "NEWLink Digital Note Taker".
> 
> Here's an example, so that you know what we are talking about:
> http://www.staedtler.com/en/products/ink-writing-instruments/ballpoint-pens/digital-pen-990-01-digital-ballpoint-pen
> 
> http://pegatech.blogspot.com/ seems to be a remaining official resource.
> 
> This device can also transfer saved (offline recorded handwritten) data and
> there are userspace programs that do this, see https://launchpad.net/m210
> (Well, alternatively there are really fast scanners out there :)
> 
> It's *really* fun to use as an input tablet though! So let's support this
> for everybody.
> 
> There's no way to disable the device. When the pen is out of range, we just
> don't get any URBs and don't do anything.
> Like all other mouses or input tablets, we don't use runtime PM.
> 
> Signed-off-by: Martin Kepplinger 
> ---
> 
> Thanks for having a look. Any more suggestions on this?
> 
> revision history
> 
> v4 use normal work queue instead of a kernel thread (thanks to Oliver Neukum)
> v3 fix reporting low pen battery and add USB list to CC
> v2 minor cleanup (remove unnecessary variables)
> v1 initial release
> 

Hi,

almost there.

Regards
Oliver

> +static void pegasus_close(struct input_dev *dev)
> +{
> + struct pegasus *pegasus = input_get_drvdata(dev);
> +
> + cancel_work_sync(&pegasus->init);
> +
> + usb_kill_urb(pegasus->irq);
> +}

This is a race condition. The URB can trigger the work.
Therefore the URB needs to die first.
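
The ordering problem can be shown with a toy model. This is not driver code:
the struct and the demo_ functions are hypothetical stand-ins for the URB, the
work item, usb_kill_urb() and cancel_work_sync(), and the model is
single-threaded, so it only shows which ordering leaves work queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the device state; all names are made up for illustration. */
struct demo_pegasus {
	bool urb_live;		/* URB still submitted, completions can arrive */
	bool work_queued;	/* init work is pending */
};

/* URB completion handler: may queue the work while the URB is live. */
static void demo_urb_complete(struct demo_pegasus *p)
{
	if (p->urb_live)
		p->work_queued = true;
}

/* Stand-in for usb_kill_urb(). */
static void demo_kill_urb(struct demo_pegasus *p)
{
	p->urb_live = false;
}

/* Stand-in for cancel_work_sync(). */
static void demo_cancel_work(struct demo_pegasus *p)
{
	p->work_queued = false;
}

/* Ordering asked for in the review: stop the source of work first. */
static void demo_close(struct demo_pegasus *p)
{
	demo_kill_urb(p);
	demo_cancel_work(p);
	assert(!p->urb_live && !p->work_queued);
}
```

With the order from the patch (cancel first, kill second), a completion that
arrives between the two calls queues the work again after it was cancelled.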

Re: [PATCH v6 1/2] soc: samsung: add exynos chipid driver support

2016-05-25 Thread Arnd Bergmann

On Wednesday, May 25, 2016 1:28:23 PM CEST Pankaj Dubey wrote:
> Exynos SoCs have Chipid, for identification of product IDs
> and SoC revisions. This patch intends to provide initialization
> code for all these functionalities, at the same time it provides some
> sysfs entries for accessing these information to user-space.
> 
> This driver uses existing binding for exynos-chipid.
> 
> CC: Grant Likely 
> CC: Rob Herring 
> CC: Linus Walleij 
> Signed-off-by: Pankaj Dubey 
> ---
>  drivers/soc/samsung/Kconfig|   5 +
>  drivers/soc/samsung/Makefile   |   1 +
>  drivers/soc/samsung/exynos-chipid.c| 172 
> +
>  include/linux/soc/samsung/exynos-soc.h |  51 ++
> 

I don't like how this exposes the internals of the Samsung SoC in a global
header file, after we spent a considerable amount of work on keeping it
confined to arch/arm/{mach-exynos,mach-s3c64xx,plat-samsung}.

Please remove the external interface of the driver, in particular the global
data structure. We keep coming back to this for a lot of platforms, and
I still think we should have an architecture-independent way of matching
platforms to struct soc_device, using an exported function from
drivers/base/soc.c that uses glob_match() to compare a platform string
against the running system.

Arnd
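
As a rough illustration of the glob-based matching idea, here is a userspace
sketch. It is an assumption-laden stand-in, not a proposed kernel interface:
POSIX fnmatch() takes the place of the kernel's glob_match(), and the function
name and the SoC strings used with it are invented for the example.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the running system's SoC string matches any pattern.
 * fnmatch() stands in for the kernel's glob_match() in this sketch.
 */
static bool demo_soc_match(const char *running_soc,
			   const char *const *patterns, size_t npatterns)
{
	size_t i;

	assert(running_soc != NULL);
	for (i = 0; i < npatterns; i++) {
		/* fnmatch() returns 0 on a match. */
		if (fnmatch(patterns[i], running_soc, 0) == 0)
			return true;
	}
	return false;
}
```

A driver would then carry only a pattern table and ask one generic helper
whether it applies, instead of reading SoC-specific globals from a shared
header.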

Re: [PATCH 00/10] String hash improvements

2016-05-25 Thread Geert Uytterhoeven

On Wed, May 25, 2016 at 10:11 AM, George Spelvin
 wrote:
> Geert Uytterhoeven wrote:
>> Usually this is handled through include/asm-generic/.
>> Put the generic default implementation in include/asm-generic/hash.h.
>>
>> Architectures that need to override provide their own version, e.g.
>> arch/m68k/include/asm/hash.h. They may #include 
>> if they still want to reuse parts of the generic implementation.
>>
>> Other architectures add "generic-y += hash.h" to their
>> arch//include/asm/Kbuild.
>
> I thought about that, but then I'd have to edit *every* architecture,
> and might need acks from all the maintainers.
>
> I was looking for something that was a total no-op on most architectures.
>
> But if this is preferred, it's not technically difficult at all.

As you only include  if CONFIG_HAVE_ARCH_HASH
is defined, you can also just call the arch-specific one .

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [RFC 1/3] block: Introduce blk_bio_map_sg() to map one bio

2016-05-25 Thread Ming Lei

On Wed, May 25, 2016 at 2:12 PM, Baolin Wang  wrote:
> In dm-crypt, it need to map one bio to scatterlist for improving the
> hardware engine encryption efficiency. Thus this patch introduces the
> blk_bio_map_sg() function to map one bio with scatterlists.
>
> Signed-off-by: Baolin Wang 
> ---
>  block/blk-merge.c  |   45 +
>  include/linux/blkdev.h |3 +++
>  2 files changed, 48 insertions(+)
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 2613531..9b92af4 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -417,6 +417,51 @@ single_segment:
>  }
>
>  /*
> + * map a bio to scatterlist, return number of sg entries setup.
> + */
> +int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
> +  struct scatterlist *sglist,
> +  struct scatterlist **sg)
> +{
> +   struct bio_vec bvec, bvprv = { NULL };
> +   struct bvec_iter iter;
> +   int nsegs, cluster;
> +
> +   nsegs = 0;
> +   cluster = blk_queue_cluster(q);
> +
> +   if (bio->bi_rw & REQ_DISCARD) {
> +   /*
> +* This is a hack - drivers should be neither modifying the
> +* biovec, nor relying on bi_vcnt - but because of
> +* blk_add_request_payload(), a discard bio may or may not have
> +* a payload we need to set up here (thank you Christoph) and
> +* bi_vcnt is really the only way of telling if we need to.
> +*/
> +
> +   if (bio->bi_vcnt)
> +   goto single_segment;
> +
> +   return 0;
> +   }
> +
> +   if (bio->bi_rw & REQ_WRITE_SAME) {
> +single_segment:
> +   *sg = sglist;
> +   bvec = bio_iovec(bio);
> +   sg_set_page(*sg, bvec.bv_page, bvec.bv_len, bvec.bv_offset);
> +   return 1;
> +   }
> +
> +   bio_for_each_segment(bvec, bio, iter)
> +   __blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
> +&nsegs, &cluster);
> +
> +   return nsegs;
> +}
> +EXPORT_SYMBOL(blk_bio_map_sg);

You can use __blk_bios_map_sg() to implement blk_bio_map_sg(),
so that the code duplication is avoided.

> +
> +/*
>   * map a request to scatterlist, return number of sg entries setup. Caller
>   * must make sure sg can hold rq->nr_phys_segments entries
>   */
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 1fd8fdf..e5de4f8 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1013,6 +1013,9 @@ extern void blk_queue_write_cache(struct request_queue *q, bool enabled, bool fu
>  extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
>
>  extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
> +extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
> + struct scatterlist *sglist,
> + struct scatterlist **sg);
>  extern void blk_dump_rq_flags(struct request *, char *);
>  extern long nr_blockdev_pages(void);
>
> --
> 1.7.9.5
>

[PATCH] clk: rockchip: add a dummy clock for the watchdog pclk on rk3399

2016-05-25 Thread Xing Zheng

Like rk3288, the pclk supplying the watchdog is controlled via the
SGRF register area. Additionally the SGRF isn't even writable in
every boot mode.

But still the clock control is available and in the future someone
might want to use it. Therefore define a simple clock for the time
being so that the watchdog driver can read its rate.

Signed-off-by: Xing Zheng 
---

 drivers/clk/rockchip/clk-rk3399.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/clk/rockchip/clk-rk3399.c b/drivers/clk/rockchip/clk-rk3399.c
index 291543f..b6742fa 100644
--- a/drivers/clk/rockchip/clk-rk3399.c
+++ b/drivers/clk/rockchip/clk-rk3399.c
@@ -1498,6 +1498,7 @@ static void __init rk3399_clk_init(struct device_node *np)
 {
struct rockchip_clk_provider *ctx;
void __iomem *reg_base;
+   struct clk *clk;
 
reg_base = of_iomap(np, 0);
if (!reg_base) {
@@ -1511,6 +1512,14 @@ static void __init rk3399_clk_init(struct device_node *np)
return;
}
 
+   /* Watchdog pclk is controlled by RK3399 SECURE_GRF_SOC_CON3[8]. */
+   clk = clk_register_fixed_factor(NULL, "pclk_wdt", "pclk_alive", 0, 1, 1);
+   if (IS_ERR(clk))
+   pr_warn("%s: could not register clock pclk_wdt: %ld\n",
+   __func__, PTR_ERR(clk));
+   else
+   rockchip_clk_add_lookup(ctx, clk, PCLK_WDT);
+
rockchip_clk_register_plls(ctx, rk3399_pll_clks,
   ARRAY_SIZE(rk3399_pll_clks), -1);
 
-- 
1.7.9.5

Re: [PATCH v2 10/32] perf/x86/intel/cqm: introduce (I)state and limbo prmids

2016-05-25 Thread Thomas Gleixner

On Tue, 24 May 2016, David Carrillo-Cisneros wrote:
> >> +static inline bool __pmonr__in_instate(struct pmonr *pmonr)
> >> +{
> >> + lockdep_assert_held(&__pkg_data(pmonr, pkg_data_lock));
> >> + return __pmonr__in_istate(pmonr) && !__pmonr__in_ilstate(pmonr);
> >>  }
> >
> > This state tracking sucks. It's completely non obvious which combinations of
> > members are denoting a certain state.
> >
> > What's wrong with having:
> >
> >pmonr->state
> >
> > and a enum
> >
> > enum pmonr_state {
> >  PMONR_UNUSED,
> >  PMONR_ACTIVE,
> >  PMONR_LIMBO,
> >  PMONR_INHERITED,
> > };
> >
> > That would make all this horror readable and understandable. I bet you can't
> > remember the meaning of all this state stuff 3 month from now. That's going 
> > to
> > be the hell of a ride to track down a problem in this code.
> 
> In the pmonr, the state can be inferred by the values of:
>   - pmonr->ancestor_pmonr
>   - pmonr->prmid
>   - pmonr->limbo_prmid

And exactly that stuff drives me nuts. You update stuff here and there and
then you infer the state from this.

> Redundantly storing the state in an extra variable opens the door to
> bugs that updates pmonr::state inconsistently with the member above.

Well, your 'infer' state from three other variables is error prone as well and
you can simply add a debug feature which makes sure that the variables are
consistent.

validate_state(p)
{
switch (p->state) {
case PMONR_UNUSED:
 WARN_ON(p-> .);

case PMONR_ACTIVE:
 WARN_ON(p-> .);
}
}

That's way better than relying on three variables which are updated here and
there to reflect the proper state.

It's not only the 3 variables which are involved there. You also have lists
and whatever which depend on this. So having a proper 'state' variable as the
central anchor gives you the ability to verify the dependent contents of your
other variables, lists etc.
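
A minimal userspace sketch of the "state variable plus debug check" idea, for illustration only. The struct pmonr_model type, the MODEL_NO_PRMID marker and the exact invariants tied to each state are assumptions made up for this example; they are not taken from the cqm patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the perf/x86/intel/cqm structures. */
enum pmonr_state {
	PMONR_UNUSED,
	PMONR_ACTIVE,
	PMONR_LIMBO,
	PMONR_INHERITED,
};

#define MODEL_NO_PRMID (-1)

struct pmonr_model {
	enum pmonr_state state;        /* central anchor for the state */
	struct pmonr_model *ancestor;  /* stands in for ancestor_pmonr */
	int prmid;
	int limbo_prmid;
};

/*
 * Check that the dependent members agree with p->state.
 * The invariants per state are assumed for illustration.
 */
bool pmonr_model_valid(const struct pmonr_model *p)
{
	bool has_ancestor = p->ancestor != NULL;
	bool has_prmid = p->prmid != MODEL_NO_PRMID;
	bool has_limbo = p->limbo_prmid != MODEL_NO_PRMID;

	switch (p->state) {
	case PMONR_UNUSED:
		return !has_ancestor && !has_prmid && !has_limbo;
	case PMONR_ACTIVE:
		return !has_ancestor && has_prmid && !has_limbo;
	case PMONR_LIMBO:
		return has_ancestor && !has_prmid && has_limbo;
	case PMONR_INHERITED:
		return has_ancestor && !has_prmid && !has_limbo;
	}
	return false;
}

/* Demo: a consistent ACTIVE entry passes the check. */
int pmonr_model_demo_consistent(void)
{
	struct pmonr_model p = { PMONR_ACTIVE, NULL, 7, MODEL_NO_PRMID };

	return pmonr_model_valid(&p);
}

/* Demo: state still says ACTIVE but prmid was cleared; the check fails. */
int pmonr_model_demo_stale(void)
{
	struct pmonr_model p = { PMONR_ACTIVE, NULL, 7, MODEL_NO_PRMID };

	p.prmid = MODEL_NO_PRMID;
	return pmonr_model_valid(&p);
}
```

In kernel code the boolean result would presumably feed a WARN_ON() under a debug option, so an inconsistent update is reported where it happens instead of surfacing later as a wrongly inferred state.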

Thanks,

tglx

Re: [PATCH RFC kernel] balloon: speed up inflating/deflating process

2016-05-25 Thread Michael S. Tsirkin

On Wed, May 25, 2016 at 08:48:17AM +, Li, Liang Z wrote:
> > > > Suggestion to address all above comments:
> > > > 1. allocate a bunch of pages and link them up,
> > > >calculating the min and the max pfn.
> > > >if max-min exceeds the allocated bitmap size,
> > > >tell host.
> > >
> > > I am not sure if it works well in some cases, e.g. The allocated pages
> > > are across a wide range and the max-min > limit is very frequently to be
> > true.
> > > Then, there will be many times of virtio transmission and it's bad for
> > > performance improvement. Right?
> > 
> > It's a tradeoff for sure. Measure it, see what the overhead is.
> > 
> 
> Hi MST,
> 
> I have measured the performance when using a 32K page bitmap,

Just to make sure. Do you mean a 32Kbyte bitmap?
Covering 1Gbyte of memory?

> and inflate the balloon to 3GB
> of an idle guest with 4GB RAM.

Should take 3 requests then, right?

> Now: 
> total inflating time: 338ms
> the count of virtio data transmission:  373

Why was this so high? I would expect 3 transmissions.

> the call count of madvise: 865
> 
> before:
> total inflating time: 175ms
> the count of virtio data transmission: 1
> the call count of madvise: 42
> 
> Maybe the result will be worse if the guest is not idle, or the guest has 
> more RAM.
> Do you want more data?
> 
> Is it worth to do that?
> 
> Liang

Either my math is wrong or there's an implementation bug.
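
A quick check of the arithmetic in this exchange, as a sketch that assumes 4 KiB guest pages and one bitmap bit per page. The page size and the function names are assumptions for illustration. It reproduces the 32 KByte to 1 GByte coverage figure and the expected 3 transfers for a 3 GB balloon when the pages fall into contiguous bitmap windows.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB guest pages, one bit per page. */
#define MODEL_PAGE_SIZE 4096ULL

/* Bytes of guest memory that a bitmap of bitmap_bytes can describe. */
uint64_t bitmap_coverage_bytes(uint64_t bitmap_bytes)
{
	return bitmap_bytes * 8 * MODEL_PAGE_SIZE;
}

/*
 * Lower bound on the number of bitmap transfers needed to describe
 * balloon_bytes of memory, reached only if the ballooned pages are
 * packed into contiguous pfn windows.
 */
uint64_t min_bitmap_transfers(uint64_t balloon_bytes, uint64_t bitmap_bytes)
{
	uint64_t cover = bitmap_coverage_bytes(bitmap_bytes);

	return (balloon_bytes + cover - 1) / cover;
}
```

The result is only a best case. Pages scattered across a wide pfn range force extra transfers, which may be one place to look when the measured count is far above 3.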

> > >
> > > > 2. limit allocated bitmap size to something reasonable.
> > > >How about 32Kbytes? This is 256kilo bit in the map, which 
> > > > comes
> > > >out to 1Giga bytes of memory in the balloon.
> > >
> > > So, even the VM has 1TB of RAM, the page bitmap will take 32MB of
> > memory.
> > > Maybe it's better to use a big page bitmap the save the pages
> > > allocated by balloon, and split the big page bitmap to 32K bytes unit, 
> > > then
> > transfer one unit at a time.
> > 
> > How is this different from what I said?
> > 
> > >
> > > Should we use a page bitmap to replace 'vb->pages' ?
> > >
> > > How about rolling back to use PFNs if the count of requested pages is a
> > small number?
> > >
> > > Liang
> > 
> > That's why we have start pfn. you can use that to pass even a single page
> > without a lot of overhead.
> > 
> > > > > --
> > > > > 1.9.1
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > > > the body of a message to majord...@vger.kernel.org More majordomo
> > > > info at http://vger.kernel.org/majordomo-info.html

RE: fsl-dcu not works on latest "drm-next"

2016-05-25 Thread Meng Yi

Hi Alexander,

> From: Alexander Stein [mailto:alexander.st...@systec-electronic.com]
> Sent: Wednesday, May 25, 2016 4:32 PM
> To: Stefan Agner 
> Cc: Meng Yi ; dri-de...@lists.freedesktop.org; David Airlie
> ; airl...@redhat.com; linux-kernel@vger.kernel.org; Mark
> Brown 
> Subject: Re: fsl-dcu not works on latest "drm-next"
> 
> On Tuesday 24 May 2016 23:20:02, Stefan Agner wrote:
> > On 2016-05-24 19:14, Meng Yi wrote:
> > > I found that its regmap endianness issue, so I want to replace the
> > > "regmap".
> > Hm, replace with what? Note that we need some kind of endianness
> > convertion since the IP is big endian on LS1021a and little endian on
> > Vybrid (vf610).
> 
> Yep, regmap is required and was broken meanwhile but should be fixed now.
> See linked lkml post.
> 
> > Is it maybe just an issue with regmap/the big-endian property in the
> > device tree? Maybe this thread is interesting for you:
> > https://lkml.org/lkml/2016/3/23/233
> 
> AFAICT device tree should not been changed here. The "big-endian" property
> was there fromt he beginning.
> 
> > > I just tested the latest drm-next branch on Freescale/NXP
> > > ls1021a-twr, and got some log below. And fsl-dcu not works.
> > >
> > > Since "drm-next" merged some branch , use git bisect had some
> > > problem ,
> > >
> > > so I manually checked out that "fsl-dcu" works at
> > > d761701c55a99598477f3cb25c03d939a7711e74
> > >
> > > And not works now. some log below:
> 
> Which commit actually broke your kernel? And where to fetch it from? Is your
> problem really caused by regmap?

Since there are lots of merge commits, I debugged that issue manually. And
yes, it is caused by regmap.
I fetched the kernel from git://people.freedesktop.org/~airlied/linux

Best Regards,
Meng Yi

Re: livepatch: Avoid possible race when releasing the patch

2016-05-25 Thread Miroslav Benes

On Mon, 23 May 2016, Jessica Yu wrote:

> +++ Petr Mladek [23/05/16 17:54 +0200]:
> > There was a long discussion about a possible race with sysfs, kobjects
> > when removing an unused livepatch, see
> > https://lkml.kernel.org/g/%3c1462190242-24731-1-git-send-email-mbe...@suse.cz%3E
> > 
> > This patch set tries to implement what looked the most preferred solution
> > from the discussion. I did my best to keep the patch definition simple.
> > But I am not super happy with the result.
> > 
> > I send the current state before I spent even more time on different
> > approaches.
> > 
> > I personally think that we might get better result if we declare
> > some limited structures, define them statically and then copy all
> > data into the final structures in a single call. I did not implement
> > this because it was weird on the first look but I am not sure now.
> > 
> > But even more I would prefer the solution with the completion.
> > It is already used by the module framework. It does not look
> > that hacky to me after all.
> 
> Hi Petr, thanks a lot for the RFC and for exploring this possible
> solution. I haven't reviewed the patches thoroughly yet, but at first
> glance I admit that I did not think through how much this approach
> would complicate the livepatch API, and the new intermediary functions
> do seem like overkill in response to the original kobject problem..
>
> I looked at how the module loader used the completion, and in fact
> it is used to remedy a nearly identical problem with
> DEBUG_KOBJ_RELEASE (see commit 942e443 "Fix mod->mkobj.kobj
> potentially freed too early"), and Miroslav's original solution pretty
> much took the same approach. We could even mirror that approach and
> have something like klp_kobject_put() (much like mod_kobject_put()) to
> package up the kobject_put/wait_for_completion calls, but that is
> purely a matter of taste.

Hi,

I'm biased here so it is not surprising that I'd go with completion. There
is even one more thing to be aware of. We have 'struct module *mod' in
klp_patch and we use it throughout the code. We still need to be careful
with it even with Petr's approach. The problem stays but it is greatly
diminished to just this one pointer. That is, one can call our sysfs
function which potentially uses the mod pointer, and the module could go
away just before that.
 
> Anyway, I am just beginning to lean towards the completion solution
> again (sorry for jumping back and forth :-/), but we can play with
> this patchset a bit more and see if we can come up with something
> reasonable.

Yes. In fact it does not look that bad. Thanks Petr for doing this.

Josh, I agree that we could return to Seth's approach but it still seems a
bit awkward. Completion is only a small modification, and since the module
loader uses it itself, it does not look like a serious hack to me anymore.
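
A single-threaded userspace model of the "put, then wait for the release" pattern discussed in this thread, as a rough sketch. All names (model_kobj, model_completion, model_put_and_wait) are hypothetical, and the delayed release is modeled with a pending flag instead of real deferred work. A real wait_for_completion() would block; this model returns false where the real code would keep waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a completion and a refcounted object. */
struct model_completion {
	bool done;
};

struct model_kobj {
	int refcount;
	bool release_pending;              /* models a delayed release */
	struct model_completion *released; /* signalled by the release step */
};

/* Drop one reference; the release itself is only scheduled, not run. */
void model_kobj_put(struct model_kobj *k)
{
	if (--k->refcount == 0)
		k->release_pending = true;
}

/* Run the delayed release step if one is pending. */
void model_run_delayed_release(struct model_kobj *k)
{
	if (!k->release_pending)
		return;
	k->release_pending = false;
	if (k->released)
		k->released->done = true;
}

/*
 * Drop our reference and wait until the release step has really run.
 * Returns false if another reference is still held, which is the case
 * where a blocking wait would not return yet.
 */
bool model_put_and_wait(struct model_kobj *k)
{
	struct model_completion c = { false };

	k->released = &c;
	model_kobj_put(k);
	while (!c.done) {
		if (!k->release_pending) {
			k->released = NULL;
			return false;
		}
		model_run_delayed_release(k);
	}
	k->released = NULL;
	return true;
}

/* Demo: the last reference is dropped, so the wait finishes. */
int model_demo_last_reference(void)
{
	struct model_kobj k = { 1, false, NULL };

	return model_put_and_wait(&k);
}

/* Demo: someone else still holds a reference, so the release never runs. */
int model_demo_extra_reference(void)
{
	struct model_kobj k = { 2, false, NULL };

	return model_put_and_wait(&k);
}
```

The point of the pattern is that the caller cannot free or reuse the containing structure until the release step has actually completed, even when that step is delayed.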

Regards,
Miroslav

Re: [PATCH v2 5/5] usb: dwc3: rockchip: add devicetree bindings documentation

2016-05-25 Thread William Wu


Hi Felipe & Rob,
On 05/25/2016 04:04 PM, Felipe Balbi wrote:

Hi,

William Wu  writes:

Hi Felipe,

On 05/24/2016 05:32 PM, Felipe Balbi wrote:

Hi,

William Wu  writes:

This patch adds the device tree binding documentation required for the
Rockchip USB3.0 core wrapper, which consists of the USB3.0 IP from Synopsys.

It could operate in device mode (SS, HS, FS) and host
mode (SS, HS, FS, LS).

Signed-off-by: William Wu 
---
Changes in v2:
- add rockchip,dwc3.txt to Documentation/devicetree/bindings/ (Felipe, Brian)

   .../devicetree/bindings/usb/rockchip,dwc3.txt  | 45 
++
   1 file changed, 45 insertions(+)
   create mode 100644 Documentation/devicetree/bindings/usb/rockchip,dwc3.txt

diff --git a/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt 
b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
new file mode 100644
index 000..10303d9
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/rockchip,dwc3.txt
@@ -0,0 +1,45 @@
+Rockchip SuperSpeed DWC3 USB SoC controller
+
+Required properties:
+- compatible:  should contain "rockchip,dwc3"
+- clocks:  A list of phandle + clock-specifier pairs for the
+   clocks listed in clock-names
+- clock-names: Should contain the following:
+  "clk_usb3otg0_ref" Controller reference clk
+  "clk_usb3otg0_suspend"Controller suspend clk, can use 24 MHz or 32 KHz
+  "aclk_usb3"Master/Core clock, have to be >= 62.5 MHz for SS 
operation
+
+
+Optional clocks:
+  "aclk_usb3otg0"Aclk for specific usb controller clock.
+  "aclk_usb3_rksoc_axi_perf"  USB AXI perf clock.  Not present on all 
platforms.
+  "aclk_usb3_grf"USB grf clock.  Not present on all platforms.
+
+Required child node:
+A child node must exist to represent the core DWC3 IP block. The name of
+the node is not important. The content of the node is defined in dwc3.txt.
+
+Phy documentation is provided in the following places:
+
+Example device nodes:
+
+   usbdrd3_0: usb@fe80 {
+

no reg property?

For now, we don't need a reg property here, because we only need to
enable some clocks and populate its children in
drivers/usb/dwc3/dwc3-of-simple.c.
It's similar to the usbdrd3_0 node in arch/arm/boot/dts/exynos5420.dtsi.

compatible = "rockchip,dwc3";

+   clocks = <&cru SCLK_USB3OTG0_REF>, <&cru SCLK_USB3OTG0_SUSPEND>,
+<&cru ACLK_USB3>, <&cru ACLK_USB3OTG0>,
+<&cru ACLK_USB3_RKSOC_AXI_PERF>, <&cru ACLK_USB3_GRF>;
+   clock-names = "clk_usb3otg0_ref", "clk_usb3otg0_suspend",
+ "aclk_usb3", "aclk_usb3otg0",
+ "aclk_usb3_rksoc_axi_perf", "aclk_usb3_grf";
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+   status = "disabled";
+   usbdrd_dwc3_0: dwc3 {

no address here?

I don't think an address is necessarily needed here. The child node dwc3
can inherit its address from the parent node.
And with this dtsi patch, the dev path shows as follows:
/sys/devices/platform/usb@fe80/fe80.dwc3

Is it needed for coding style or for some other reason?

I don't think your arguments match what devicetree folks want to see in
DT. Let's ask them. Rob, care to look at this one?

Sorry, I need to correct myself. I have done some tests, and the results
show that the child node dwc3 doesn't inherit its address from the parent
node, but gets its address from its reg property. And it seems that whether
I add the address here or not, the dwc3 node always gets its address from
the reg property.
However, I don't know much about DT. But I think it's better to add the
address here than not.



+   compatible = "snps,dwc3";
+   reg = <0x0 0xfe80 0x0 0x10>;
+   interrupts = ;
+   dr_mode = "otg";
+   status = "disabled";
+   };
+   };
--
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [RFC 1/3] block: Introduce blk_bio_map_sg() to map one bio

2016-05-25 Thread Baolin Wang

On 25 May 2016 at 16:52, Ming Lei  wrote:
>>  /*
>> + * map a bio to scatterlist, return number of sg entries setup.
>> + */
>> +int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
>> +  struct scatterlist *sglist,
>> +  struct scatterlist **sg)
>> +{
>> +   struct bio_vec bvec, bvprv = { NULL };
>> +   struct bvec_iter iter;
>> +   int nsegs, cluster;
>> +
>> +   nsegs = 0;
>> +   cluster = blk_queue_cluster(q);
>> +
>> +   if (bio->bi_rw & REQ_DISCARD) {
>> +   /*
>> +* This is a hack - drivers should be neither modifying the
>> +* biovec, nor relying on bi_vcnt - but because of
>> +* blk_add_request_payload(), a discard bio may or may not 
>> have
>> +* a payload we need to set up here (thank you Christoph) and
>> +* bi_vcnt is really the only way of telling if we need to.
>> +*/
>> +
>> +   if (bio->bi_vcnt)
>> +   goto single_segment;
>> +
>> +   return 0;
>> +   }
>> +
>> +   if (bio->bi_rw & REQ_WRITE_SAME) {
>> +single_segment:
>> +   *sg = sglist;
>> +   bvec = bio_iovec(bio);
>> +   sg_set_page(*sg, bvec.bv_page, bvec.bv_len, bvec.bv_offset);
>> +   return 1;
>> +   }
>> +
>> +   bio_for_each_segment(bvec, bio, iter)
>> +   __blk_segment_map_sg(q, &bvec, sglist, &bvprv, sg,
>> +&nsegs, &cluster);
>> +
>> +   return nsegs;
>> +}
>> +EXPORT_SYMBOL(blk_bio_map_sg);
>
> You can use __blk_bios_map_sg() to implement blk_bio_map_sg(),
> then code duplication may be avoided.

OK. I'll re-factor the code to map one bio.

>
>> +
>> +/*
>>   * map a request to scatterlist, return number of sg entries setup. Caller
>>   * must make sure sg can hold rq->nr_phys_segments entries
>>   */
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index 1fd8fdf..e5de4f8 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -1013,6 +1013,9 @@ extern void blk_queue_write_cache(struct request_queue 
>> *q, bool enabled, bool fu
>>  extern struct backing_dev_info *blk_get_backing_dev_info(struct 
>> block_device *bdev);
>>
>>  extern int blk_rq_map_sg(struct request_queue *, struct request *, struct 
>> scatterlist *);
>> +extern int blk_bio_map_sg(struct request_queue *q, struct bio *bio,
>> + struct scatterlist *sglist,
>> + struct scatterlist **sg);
>>  extern void blk_dump_rq_flags(struct request *, char *);
>>  extern long nr_blockdev_pages(void);
>>
>> --
>> 1.7.9.5
>>



-- 
Baolin.wang
Best Regards

Re: [PATCH 08/10] m68k: Add

2016-05-25 Thread Philippe De Muyter

On Wed, May 25, 2016 at 03:34:55AM -0400, George Spelvin wrote:
> +static inline u32 __attribute_const__ __hash_32(u32 x)
> +{
> + u32 a, b;
> +
> + asm(   "move.l %2,%0"   /* 0x0001 */
> + "\n lsl.l #2,%0"/* 0x0004 */
> + "\n move.l %0,%1"
> + "\n lsl.l #7,%0"/* 0x0200 */
> + "\n add.l %2,%0"/* 0x0201 */
> + "\n add.l %0,%1"/* 0x0205 */
> + "\n add.l %0,%0"/* 0x0402 */
> + "\n add.l %0,%1"/* 0x0607 */
> + "\n lsl.l #5,%0"/* 0x8040 */
> + /* 0x8647 */

There is no standard way to write asm in the kernel, but I prefer
a simple semicolon after each insn

asm("move.l %2,%0;" /* 0x0001 */
"lsl.l  #2,%0;" /* 0x0004 */
"move.l %0,%1;"
"lsl.l  #7,%0;" /* 0x0200 */
"add.l  %2,%0;" /* 0x0201 */
"add.l  %0,%1;" /* 0x0205 */
"add.l  %0,%0;" /* 0x0402 */
"add.l  %0,%1;" /* 0x0607 */
"lsl.l  #5,%0"  /* 0x8040 */
/* 0x8647 */

Also, it took me some time to understand the hexadecimal constants
in the comments (and the last one predicts a future event :)).

> + : "=&d" (a), "=&r" (b)
> + : "g" (x));
> +
> + return ((u16)(x*0x61c8) << 16) + a + b;
> +}

Just my two cents

Philippe

Re: [PATCH 00/10] String hash improvements

2016-05-25 Thread George Spelvin

>> +#if defined(CONFIG_M68000) || defined(CONFIG_M68010)

> As I said before, I don't think you need this check, given HAVE_ARCH_HASH is
> selected by M68000, and M68010 doesn't exist.

I was going belt & suspenders on general principles, but yes, I'm happy
to leave it out.

I noticed that CONFIG_M68010 doesn't exist in Linus' tree, but you
recommended it, so I thought you might know something I don't.

> As you only include  if CONFIG_HAVE_ARCH_HASH
> is defined, you can also just call the arch-specific one .

Yes, that's a possibility, too.  But weren't we still discussing whether
I should use conditional #inclusion based on a symbol, or asm-generic?

If neither of us has a killer argument that convinces the other, style
issues like this are amenable to voting, so I was going to wait a little
bit for others to chime in.


Thank you very much for the comments!

Re: [PATCH 06/16] sched: Disable WAKE_AFFINE for asymmetric configurations

2016-05-25 Thread Morten Rasmussen

On Tue, May 24, 2016 at 05:53:27PM +0200, Vincent Guittot wrote:
> On 24 May 2016 at 17:02, Morten Rasmussen  wrote:
> > On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote:
> >> On 24 May 2016 at 15:36, Morten Rasmussen  wrote:
> >> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote:
> >> >> On 24 May 2016 at 15:16, Morten Rasmussen  
> >> >> wrote:
> >> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote:
> >> >> >> On 24 May 2016 at 12:29, Morten Rasmussen  
> >> >> >> wrote:
> >> >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote:
> >> >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen 
> >> >> >> >>  wrote:
> >> >> >> >> > If the system has cpu of different compute capacities (e.g. 
> >> >> >> >> > big.LITTLE)
> >> >> >> >> > let affine wakeups be constrained to cpus of the same type.
> >> >> >> >>
> >> >> >> >> Can you explain why you don't want wake affine with cpus with
> >> >> >> >> different compute capacity ?
> >> >> >> >
> >> >> >> > I should have made the overall idea a bit more clear. The idea is 
> >> >> >> > to
> >> >> >> > deal with cross-capacity migrations in the find_idlest_{group, 
> >> >> >> > cpu}{}
> >> >> >> > path so we don't have to touch select_idle_sibling().
> >> >> >> > select_idle_sibling() is critical for wake-up latency, and I'm 
> >> >> >> > assumed
> >> >> >> > that people wouldn't like adding extra overhead in there to deal 
> >> >> >> > with
> >> >> >> > capacity and utilization.
> >> >> >>
> >> >> >> So this means that we will never use the quick path of
> >> >> >> select_idle_sibling for cross capacity migration but always the one
> >> >> >> with extra overhead?
> >> >> >
> >> >> > Yes. select_idle_sibling() is only used to choose among equal capacity
> >> >> > cpus (capacity_orig).
> >> >> >
> >> >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also
> >> >> >> be used for cross capacity migration ? so we can use wake_affine if
> >> >> >> the task or the cpus (even with different capacity) doesn't need this
> >> >> >> extra overhead
> >> >> >
> >> >> > The test in patch 9 is to determine whether we are happy with the
> >> >> > capacity of the previous cpu, or we should go look for one with more
> >> >> > capacity. I don't see how we can use select_idle_sibling() unmodified
> >> >> > for sched domains containing cpus of different capacity to select an
> >> >> > appropriate cpu. It is just picking an idle cpu, it might have high
> >> >> > capacity or low, it wouldn't care.
> >> >> >
> >> >> > How would you avoid the overhead of checking capacity and utilization 
> >> >> > of
> >> >> > the cpus and still pick an appropriate cpu?
> >> >>
> >> >> My point is that there is some wake up case where we don't care about
> >> >> the capacity and utilization of cpus even for cross capacity migration
> >> >> and we will never take benefit of this fast path.
> >> >> You have added an extra check for setting want_affine in patch 9 which
> >> >> uses capacity and utilization of cpu to disable this fast path when a
> >> >> task needs more capacity than available. Can't you use this function
> >> >> to disable the want_affine for cross-capacity migration situation that
> >> >> cares of the capacity and need the full scan of sched_domain but keep
> >> >> it enable for other cases ?
> >> >
> >> > It is not clear to me what the other cases are. What kind of cases do
> >> > you have in mind?
> >>
> >> As an example, you have a task A that have to be on a big CPU because
> >> of the requirement of compute capacity, that wakes up a task B that
> >> can run on any cpu according to its utilization. The fast wake up path
> >> is fine for task B whatever prev cpu is.
> >
> > In that case, we will take always take fast path (select_idle_sibling())
> > for task B if wake_wide() allows it, which should be fine.
> 
> Even if want_affine is set, the wake up of task B will not use the fast path.
> The affine_sd will not be set because the sched_domain, which have
> both cpus, will not have the SD_WAKE_AFFINE flag according to this
> patch, isn't it ?
> So task B can't use the fast path whereas nothing prevent him to take
> benefit of it
> 
> Am I missing something ?

No, I think you are right. Very good point. The cpumask test with
sched_domain_span() will of course return false. So yes, in this case the
slow path is taken. It isn't wrong as such, just slower for asymmetric
capacity systems :-)
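
A simplified userspace model of the case described above, as a sketch only. The model_domain struct, the flag bit value and the 4+4 cpu layout are assumptions for illustration, and the loop keeps only the flag and span tests, not the rest of select_task_rq_fair(). It shows why a wakeup whose previous cpu sits in the other cluster finds no affine domain once the top-level domain loses its affine flag.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SD_WAKE_AFFINE 0x1u /* hypothetical flag bit */

struct model_domain {
	unsigned int flags;
	unsigned int span; /* bit n set => cpu n is in this domain */
};

/*
 * Walk the waking cpu's domains from lowest to highest level and return
 * the index of the first one that allows affine wakeups and contains
 * prev_cpu, or -1 if none does (the slow path would then be taken).
 */
int model_find_affine_domain(const struct model_domain *sd, size_t levels,
			     int prev_cpu, int want_affine)
{
	size_t i;

	if (!want_affine)
		return -1;

	for (i = 0; i < levels; i++) {
		if ((sd[i].flags & MODEL_SD_WAKE_AFFINE) &&
		    (sd[i].span & (1u << prev_cpu)))
			return (int)i;
	}
	return -1;
}

/*
 * Demo layout (assumed): cpus 0-3 are little, cpus 4-7 are big.
 * The domains below are those seen by a waking big cpu.
 */
int model_demo_affine_level(int top_level_affine, int prev_cpu)
{
	struct model_domain sd[2] = {
		{ MODEL_SD_WAKE_AFFINE, 0xf0u },
		{ top_level_affine ? MODEL_SD_WAKE_AFFINE : 0u, 0xffu },
	};

	return model_find_affine_domain(sd, 2, prev_cpu, 1);
}
```

With the top-level flag cleared, a task whose previous cpu is little cpu 1 gets -1 even though want_affine is set, which corresponds to task B in the example above. With the flag kept, the same wakeup would match at the top level.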

It is clearly not as optimized for asymmetric capacity systems as it
could be, but my focus was to not ruin existing behaviour and minimize
overhead for others. There are a lot of different routes through those
conditions in the first half of select_task_rq_fair() that aren't
obvious. I worry that some users depend on them and that I don't
see/understand all of them.

If people agree on changing things, it is fine with me. I just tried to
avoid getting the patches shot down on that account ;-)

Re: [PATCH 08/10] m68k: Add

2016-05-25 Thread George Spelvin

> On Wed, May 25, 2016 at 03:34:55AM -0400, George Spelvin wrote:
>> +static inline u32 __attribute_const__ __hash_32(u32 x)
>> +{
>> +u32 a, b;
>> +
>> +asm(   "move.l %2,%0"   /* 0x0001 */
>> +"\n lsl.l #2,%0"/* 0x0004 */
>> +"\n move.l %0,%1"
>> +"\n lsl.l #7,%0"/* 0x0200 */
>> +"\n add.l %2,%0"/* 0x0201 */
>> +"\n add.l %0,%1"/* 0x0205 */
>> +"\n add.l %0,%0"/* 0x0402 */
>> +"\n add.l %0,%1"/* 0x0607 */
>> +"\n lsl.l #5,%0"/* 0x8040 */
>> +/* 0x8647 */

> There is no standard way to write asm in the kernel, but I prefer
> a simple semicolon after each insn

I did it the way I did above because it makes the gcc -S output very
legible.  Just like I put a space before the operands on m68k but a tab
on h8300: that's what GCC does on those platforms.

I started with the "\n\t" suffixes on each line like so much other
kernel code, but then figured out the format above which is legible
both in C source and compiler output.

>>  asm("move.l %2,%0;" /* 0x0001 */
>>  "lsl.l  #2,%0;" /* 0x0004 */
>>  "move.l %0,%1;"
>>  "lsl.l  #7,%0;" /* 0x0200 */
>>  "add.l  %2,%0;" /* 0x0201 */
>>  "add.l  %0,%1;" /* 0x0205 */
>>  "add.l  %0,%0;" /* 0x0402 */
>>  "add.l  %0,%1;" /* 0x0607 */
>>  "lsl.l  #5,%0"  /* 0x8040 */
>>  /* 0x8647 */

> Also, it took me some time to understand the hexadecimal constants
> in the comments (and the last one predicts a future event :)).

Can you recommend a better way to comment this?  My nose is so deep
in the code it's hard for me to judge.
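
One possible way to make the hex constants easier to follow, sketched in portable C instead of m68k asm. Each comment names the variable that was just modified and the multiple of x it now holds, so the final 0x8647 reads as the sum of a and b. The function names are made up for this example. The check against x * 0x61c88647 assumes that the two halves in the quoted code, 0x61c8 and 0x8647, form that 32-bit multiplier.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by 0x8647 with the same shift/add steps as the quoted asm. */
uint32_t mul_0x8647_shift_add(uint32_t x)
{
	uint32_t a, b;

	a = x;        /* a = x * 0x0001 */
	a <<= 2;      /* a = x * 0x0004 */
	b = a;        /* b = x * 0x0004 */
	a <<= 7;      /* a = x * 0x0200 */
	a += x;       /* a = x * 0x0201 */
	b += a;       /* b = x * 0x0205 */
	a += a;       /* a = x * 0x0402 */
	b += a;       /* b = x * 0x0607 */
	a <<= 5;      /* a = x * 0x8040 */
	return a + b; /* a + b = x * 0x8647 */
}

/* Full hash: high half via a 16-bit multiply, low half via shift/add. */
uint32_t hash_32_model(uint32_t x)
{
	uint32_t high = (uint32_t)(uint16_t)(x * 0x61c8u) << 16;

	return high + mul_0x8647_shift_add(x);
}

/* Reference: plain 32-bit multiply, wrapping modulo 2^32. */
uint32_t hash_32_reference(uint32_t x)
{
	return x * 0x61c88647u;
}
```

Writing "a = x * 0x0201" instead of a bare "0x0201" costs a few characters per line but removes the need to work out which register each constant refers to.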

> Just my two cents

And thank you very much for them!

Re: fsl-dcu not works on latest "drm-next"

2016-05-25 Thread Mark Brown

On Wed, May 25, 2016 at 02:14:09AM +, Meng Yi wrote:

Please don't top post, reply in line with needed context.  This allows
readers to readily follow the flow of conversation and understand what
you are talking about and also helps ensure that everything in the
discussion is being addressed.

> Regmap endianness issue had caused some other drivers not work, like SPI etc. 
> Or this is fixed and I just don't know?

Without any description of the problem it is difficult to comment.
There were some drivers that were abusing the API by hacking round
things that need fixing (the main one I've seen is reporting things as
big endian instead of native endian to cause two layers of translation
to kick in) and one that was trying to use regmap to represent something
that just fundamentally wasn't a regmap so *any* change in regmap
internals was risky.


[PATCH] Drivers: hv: avoid vfree() on crash

2016-05-25 Thread Vitaly Kuznetsov

When we crash from NMI context (e.g. after NMI injection from host when
'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit

kernel BUG at mm/vmalloc.c:1530!

as vfree() is not allowed in NMI context. While the issue could be solved
with an in_nmi() check, I opted instead for skipping vfree() on all sorts of
crashes to reduce the amount of work which can cause subsequent crashes. We
don't really need to free anything on crash.

Signed-off-by: Vitaly Kuznetsov 
---
 drivers/hv/hv.c   | 8 +---
 drivers/hv/hyperv_vmbus.h | 2 +-
 drivers/hv/vmbus_drv.c| 8 
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a1c086b..60dbd6c 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -278,7 +278,7 @@ cleanup:
  *
  * This routine is called normally during driver unloading or exiting.
  */
-void hv_cleanup(void)
+void hv_cleanup(bool crash)
 {
union hv_x64_msr_hypercall_contents hypercall_msr;
 
@@ -288,7 +288,8 @@ void hv_cleanup(void)
if (hv_context.hypercall_page) {
hypercall_msr.as_uint64 = 0;
wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
-   vfree(hv_context.hypercall_page);
+   if (!crash)
+   vfree(hv_context.hypercall_page);
hv_context.hypercall_page = NULL;
}
 
@@ -308,7 +309,8 @@ void hv_cleanup(void)
 
hypercall_msr.as_uint64 = 0;
wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64);
-   vfree(hv_context.tsc_page);
+   if (!crash)
+   vfree(hv_context.tsc_page);
hv_context.tsc_page = NULL;
}
 #endif
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 718b5c7..dfa9fac 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -495,7 +495,7 @@ struct hv_ring_buffer_debug_info {
 
 extern int hv_init(void);
 
-extern void hv_cleanup(void);
+extern void hv_cleanup(bool crash);
 
 extern int hv_post_message(union hv_connection_id connection_id,
 enum hv_message_type message_type,
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 952f20f..d11690e 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -871,7 +871,7 @@ err_alloc:
bus_unregister(&hv_bus);
 
 err_cleanup:
-   hv_cleanup();
+   hv_cleanup(false);
 
return ret;
 }
@@ -1323,7 +1323,7 @@ static void hv_kexec_handler(void)
vmbus_initiate_unload(false);
for_each_online_cpu(cpu)
smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
-   hv_cleanup();
+   hv_cleanup(false);
 };
 
 static void hv_crash_handler(struct pt_regs *regs)
@@ -1335,7 +1335,7 @@ static void hv_crash_handler(struct pt_regs *regs)
 * for kdump.
 */
hv_synic_cleanup(NULL);
-   hv_cleanup();
+   hv_cleanup(true);
 };
 
 static int __init hv_acpi_init(void)
@@ -1395,7 +1395,7 @@ static void __exit vmbus_exit(void)
 &hyperv_panic_block);
}
bus_unregister(&hv_bus);
-   hv_cleanup();
+   hv_cleanup(false);
for_each_online_cpu(cpu) {
tasklet_kill(hv_context.event_dpc[cpu]);
smp_call_function_single(cpu, hv_synic_cleanup, NULL, 1);
-- 
2.5.5

Re: [PATCH] devicetree - document using aliases to set spi bus number.

2016-05-25 Thread Mark Rutland

On Tue, May 24, 2016 at 01:41:26PM -0700, Frank Rowand wrote:
> On 5/24/2016 10:41 AM, Mark Rutland wrote:
> > On Tue, May 24, 2016 at 06:39:20PM +0200, Christer Weinigel wrote:
> >> Document how to use devicetree aliases to assign a stable
> >> bus number to a spi bus.
> >>
> >> Signed-off-by: Christer Weinigel 
> >>
> >> ---
> >>
> >> Trivial documentation change.
> >>
> >> Not having used devicetree that much it was surprisingly hard to
> >> figure out how to assign a stable bus number to a spi bus.  Add a
> >> simple example that shows how to do that.
> >>
> >> Mark Cced as the SPI maintainer.  Or should trivial documentation
> >> fixes like this be addressed to someone else?
> >>
> >>   /Christer
> >>
> >>  Documentation/devicetree/bindings/spi/spi-bus.txt | 10 ++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/spi/spi-bus.txt 
> >> b/Documentation/devicetree/bindings/spi/spi-bus.txt
> >> index 42d5954..c35c4c2 100644
> >> --- a/Documentation/devicetree/bindings/spi/spi-bus.txt
> >> +++ b/Documentation/devicetree/bindings/spi/spi-bus.txt
> >> @@ -94,3 +94,13 @@ SPI example for an MPC5200 SPI bus:
> >>reg = <1>;
> >>};
> >>};
> >> +
> >> +Normally SPI buses are assigned dynamic bus numbers starting at 32766
> >> +and counting downwards.  It is possible to assign the bus number
> >> +statically using devicetee aliases.  For example, on the MPC5200 the
> >> +"spi@f00" device above is connected to the "soc" bus.  To set its
> >> +bus_num to 1 add an aliases entry like this:
> > 
> > As Mark Brown pointed out, this is very Linux-specific (at least in the
> > wording of the above).
> 
> Yes, Linux-specific.  So the Linux documentation of bindings is the
> correct place for it.

I don't entirely agree. Which is not to say that I disagree as such, but
rather that this is not a black-and-white affair.

While bindings do happen to live in the kernel tree, we try to keep them
separate from Linux internals or Linux API details that are outside of
the scope of the HW/kernel interface. There are certainly reasons to
describe Linux-specific bindings (e.g. things under /chosen).

Mark Brown's comments imply that there is a better mechanism which does
not rely on this binding, so even if we must retain support for it in
Linux for legacy reasons, documenting it as a binding is not necessarily
in anyone's best interest. If we want to document it, we may want to
mark it as deprecated, with a pointer to better alternatives.

> > Generally, aliases are there to match _physical_ identifiers (e.g. to
> > match physical labels for UART0, UART1, and on).
> > 
> > I'm not sure whether that applies here.
> 
> The code and behavior is in the Linux kernel. It should be visible in
> the documentation instead of being a big mystery of how it works.

As above, I don't entirely agree. Mindlessly documenting existing Linux
behaviour can have the unfortunate effect of moving people towards the
wrong tool for the job.

Thanks,
Mark.

Re: [PATCH 07/16] sched: Make SD_BALANCE_WAKE a topology flag

2016-05-25 Thread Morten Rasmussen

On Wed, May 25, 2016 at 07:52:49AM +0800, Yuyang Du wrote:
> On Mon, May 23, 2016 at 11:58:49AM +0100, Morten Rasmussen wrote:
> > For systems with the SD_ASYM_CPUCAPACITY flag set on higher level in the
> > sched_domain hierarchy we need a way to enable wake-up balancing for the
> > lower levels as well as we may want to balance tasks that don't fit the
> > capacity of the previous cpu.
> > 
> > We have the option of introducing a new topology flag to express this
> > requirement, or let the existing SD_BALANCE_WAKE flag be set by the
> > architecture as a topology flag. The former means introducing yet
> > another flag, the latter breaks the current meaning of topology flags.
> > None of the options are really desirable.
>  
> I'd propose to replace SD_WAKE_AFFINE with SD_BALANCE_WAKE. And the
> SD_WAKE_AFFINE semantic is simply "waker allowed":
> 
> waker_allowed = cpumask_test_cpu(cpu, tsk_cpus_allowed(p));
> 
> This can be implemented without current functionality change.
> 
> From there, the choice between waker and wakee, and fast path
> select_idle_sibling() and the rest slow path should be reworked, which
> I am thinking about.

I don't really understand how that would work. If you change the
semantics of the flags you don't preserve current behaviour. To me it
sounds like a total rewrite of everything.

SD_BALANCE_WAKE controls whether we go slow path or not in case
want_affine is false. SD_WAKE_AFFINE controls whether we should consider
waking up near the waker instead of always waking up near the previous
cpu.
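
The roles of the two flags described above can be summarised in a small
model. This is only a sketch under stated assumptions: pick_wake_path()
and the enum are hypothetical names, and the per-domain walk done by the
scheduler is flattened into three booleans, so it shows the priority
between the flags rather than the real select_task_rq_fair() logic.

```c
#include <stdbool.h>

enum wake_path {
	WAKE_NEAR_PREV,       /* fast path, stay near the previous cpu */
	WAKE_CONSIDER_WAKER,  /* fast path, the waker's cpu is a candidate too */
	WAKE_SLOW_PATH,       /* wake-up balancing over the sched_domains */
};

/*
 * want_affine:     the wake-up itself is eligible for affine placement
 * sd_wake_affine:  a domain with SD_WAKE_AFFINE covers waker and prev cpu
 * sd_balance_wake: a domain with SD_BALANCE_WAKE was found
 */
static enum wake_path pick_wake_path(bool want_affine, bool sd_wake_affine,
				     bool sd_balance_wake)
{
	/* Affine wake-ups take priority over wake-up balancing. */
	if (want_affine && sd_wake_affine)
		return WAKE_CONSIDER_WAKER;
	/* Otherwise SD_BALANCE_WAKE decides whether we take the slow path. */
	if (sd_balance_wake)
		return WAKE_SLOW_PATH;
	return WAKE_NEAR_PREV;
}
```

With want_affine false, the result depends only on SD_BALANCE_WAKE, which
is the case described in the paragraph above.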

RE: [PATCH RFC kernel] balloon: speed up inflating/deflating process

2016-05-25 Thread Li, Liang Z

> On Wed, May 25, 2016 at 08:48:17AM +0000, Li, Liang Z wrote:
> > > > > Suggestion to address all above comments:
> > > > >   1. allocate a bunch of pages and link them up,
> > > > >  calculating the min and the max pfn.
> > > > >  if max-min exceeds the allocated bitmap size,
> > > > >  tell host.
> > > >
> > > > I am not sure it works well in some cases, e.g. when the allocated
> > > > pages span a wide range and max-min > limit is true very
> > > > frequently.
> > > > Then there will be many virtio transmissions, which is bad
> > > > for performance. Right?
> > >
> > > It's a tradeoff for sure. Measure it, see what the overhead is.
> > >
> >
> > Hi MST,
> >
> > I have measured the performance when using a 32K page bitmap,
> 
> Just to make sure. Do you mean a 32Kbyte bitmap?
> Covering 1Gbyte of memory?
Yes.

> 
> > and inflate the balloon to 3GB
> > of an idle guest with 4GB RAM.
> 
> Should take 3 requests then, right?
> 

No, we can't choose the PFN when allocating a page in the balloon driver,
so the PFNs of the allocated pages may span a large range. We tell the
host once pfn_max - pfn_min >= 0x40000 (a 1GB range), so the request
count is most likely more than 3.


> > Now:
> > total inflating time: 338ms
> > the count of virtio data transmission:  373
> 
> Why was this so high? I would expect 3 transmissions.

I followed your suggestion:

Suggestion to address all above comments:
1. allocate a bunch of pages and link them up,
   calculating the min and the max pfn.
   if max-min exceeds the allocated bitmap size,
   tell host.
2. limit allocated bitmap size to something reasonable.
   How about 32Kbytes? This is 256kilo bit in the map, which comes
   out to 1Giga bytes of memory in the balloon.
-
Because the PFNs of the allocated pages do not increase linearly, 3
transmissions are impossible.
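
The effect can be reproduced with a small userspace model. This is a
sketch, not the balloon driver code: count_requests() is a hypothetical
helper, and its flush rule (tell the host as soon as the running max-min
would reach the window) is an assumption based on the suggestion quoted
above. With a 32 KB bitmap and 4 KB pages the window is 32768 * 8 =
262144 PFNs, i.e. 1 GB.

```c
#include <stddef.h>

/* 32 KB bitmap, one bit per 4 KB page: 262144 PFNs, i.e. 1 GB. */
#define BITMAP_BYTES  (32UL * 1024)
#define WINDOW_PFNS   (BITMAP_BYTES * 8)

/*
 * Count how many times the host would be told when pages arrive in the
 * order given by pfns[] and a request is flushed as soon as the span
 * max - min would reach `window`.
 */
static unsigned long count_requests(const unsigned long *pfns, size_t n,
				    unsigned long window)
{
	unsigned long min = 0, max = 0, requests = 0;
	int pending = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Span of the current batch if this page were added to it. */
		unsigned long lo = pending && min < pfns[i] ? min : pfns[i];
		unsigned long hi = pending && max > pfns[i] ? max : pfns[i];

		if (pending && hi - lo >= window) {
			requests++;        /* tell host, start a new batch */
			lo = hi = pfns[i];
		}
		min = lo;
		max = hi;
		pending = 1;
	}
	/* The last, partially filled batch still needs one request. */
	return requests + (pending ? 1 : 0);
}
```

With a window of 4 PFNs, the in-order sequence 0..11 needs 3 requests,
while the interleaved sequence 0, 100, 1, 101, 2, 102 needs 6, one per
page, although it touches only two windows.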


Liang

> 
> > the call count of madvise: 865
> >
> > before:
> > total inflating time: 175ms
> > the count of virtio data transmission: 1
> > the call count of madvise: 42
> >
> > Maybe the result will be worse if the guest is not idle, or the guest has
> more RAM.
> > Do you want more data?
> >
> > Is it worth to do that?
> >
> > Liang
> 
> Either my math is wrong or there's an implementation bug.
> 
> > > >
> > > > >   2. limit allocated bitmap size to something reasonable.
> > > > >  How about 32Kbytes? This is 256kilo bit in the map, which 
> > > > > comes
> > > > >  out to 1Giga bytes of memory in the balloon.
> > > >
> > > > So, even if the VM has 1TB of RAM, the page bitmap will take 32MB of
> > > > memory.
> > > > Maybe it's better to use a big page bitmap to save the pages
> > > > allocated by the balloon, split the big page bitmap into 32K byte
> > > > units, then transfer one unit at a time.
> > >
> > > How is this different from what I said?
> > >
> > > >
> > > > Should we use a page bitmap to replace 'vb->pages' ?
> > > >
> > > > How about rolling back to use PFNs if the count of requested pages
> > > > is a
> > > small number?
> > > >
> > > > Liang
> > >
> > > That's why we have start pfn. you can use that to pass even a single
> > > page without a lot of overhead.
> > >
> > > > > > --
> > > > > > 1.9.1
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe kvm"
> > > > > in the body of a message to majord...@vger.kernel.org More
> > > > > majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in the body of
> a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

Re: Builtin microcode does nothing..

2016-05-25 Thread Borislav Petkov

On Sat, May 21, 2016 at 09:51:18AM +0200, Borislav Petkov wrote:
> I'll ping you once I'm done testing here.

Ok, I've just uploaded a branch, it passes testing here.

http://git.kernel.org/cgit/linux/kernel/git/bp/bp.git, branch tip-microcode

@Jim, I'd appreciate it if you ran it again, if you get a chance, to
confirm everything is still ok.

Thanks.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
--

Re: can't boot with reiserfs on linux-4.6.0+

2016-05-25 Thread Jeff Chua

On Wed, May 25, 2016 at 2:37 AM, Al Viro  wrote:
> On Tue, May 24, 2016 at 04:59:02PM +0100, Al Viro wrote:
>
>> Umm...  Any chance of getting the function names to go with the addresses?
>> I'll try to reproduce it here, but the things would be easier with that
>> information...
>
> See if this fixes your reproducer.
>
> diff --git a/fs/xattr.c b/fs/xattr.c
> index b11945e..49b8eab 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -667,6 +667,9 @@ xattr_resolve_name(const struct xattr_handler **handlers, 
> const char **name)
>  {
> const struct xattr_handler *handler;
>
> +   if (!handlers)
> +   return NULL;
> +
> if (!*name)
> return NULL;
>

Tried, but doesn't work.

Here's dmesg with symbols ...


[   35.565534] BUG: unable to handle kernel NULL pointer dereference
at 0020
[   35.566200] IP: [] generic_getxattr+0x4f/0x5d
[   35.566828] PGD 409992067 PUD 409993067 PMD 0
[   35.567469] Oops:  [#1] SMP
[   35.568082] Modules linked in: usbhid
[   35.568731] CPU: 1 PID: 1873 Comm: bash Not tainted 4.6.0 #5
[   35.569339] Hardware name: LENOVO 20F5000RSG/20F5000RSG, BIOS
R02ET44W (1.17 ) 01/25/2016
[   35.569981] task: 88040c3f2580 ti: 88040990c000 task.ti:
88040990c000
[   35.570603] RIP: 0010:[]  []
generic_getxattr+0x4f/0x5d
[   35.571246] RSP: 0018:88040990fdd8  EFLAGS: 00010207
[   35.571843] RAX:  RBX: 88041043d6c0 RCX: 819e2917
[   35.572436] RDX: 8804104b4310 RSI: 88041043d6c0 RDI: 
[   35.573085] RBP: 8804104b4310 R08: 88040990fe0c R09: 0014
[   35.573673] R10:  R11:  R12: 88040990fe0c
[   35.574257] R13: 88040e60a6c0 R14: 0022 R15: 
[   35.574868] FS:  7f092f53e700() GS:88042144()
knlGS:
[   35.575446] CS:  0010 DS:  ES:  CR0: 80050033
[   35.576013] CR2: 0020 CR3: 000409991000 CR4: 003406e0
[   35.576621] DR0:  DR1:  DR2: 
[   35.577186] DR3:  DR6: fffe0ff0 DR7: 0400
[   35.577748] Stack:
[   35.578342]  0014 819e2917 88040990fe3c

[   35.578960]  8800d25ce600 8123 810c75a2

[   35.579583]   88040e607000 81299bc1

[   35.580172] Call Trace:
[   35.580749]  [] ? get_vfs_caps_from_disk+0x51/0xcf
[   35.581365]  [] ? __vma_link_rb+0x58/0x73
[   35.581933]  [] ? cap_bprm_set_creds+0x1b0/0x420
[   35.582504]  [] ? prepare_binprm+0xce/0x107
[   35.583095]  [] ? do_execveat_common.isra.49+0x3d0/0x5b4
[   35.583657]  [] ? do_execve+0x1a/0x1c
[   35.584248]  [] ? SyS_execve+0x23/0x2a
[   35.584801]  [] ? do_syscall_64+0x51/0x89
[   35.585345]  [] ? entry_SYSCALL64_slow_path+0x25/0x25
[   35.585882] Code: 8b b8 a0 00 00 00 e8 6c fc ff ff 4c 8b 04 24 48
3d 00 f0 ff ff 77 19 4d 89 c1 48 8b 4c 24 08 4d 89 e0 48 89 ea 48 89
de 48 89 c7  50 20 48 98 48 83 c4 10 5b 5d 41 5c c3 41 54 48 c7 c0
18 4e
[   35.587155] RIP  [] generic_getxattr+0x4f/0x5d
[   35.587776]  RSP 
[   35.588351] CR2: 0020
[   35.588974] ---[ end trace 1ac6eb2a9a9b2964 ]---

Thanks,
Jeff

Re: [PATCH v3 2/2] i2c: qup: support SMBus block read

2016-05-25 Thread rajeev kumar

On Fri, May 20, 2016 at 3:14 AM, Austin Christ  wrote:
> From: Naveen Kaje 
>
> I2C QUP driver relies on SMBus emulation support from the framework.
> To handle SMBus block reads, the driver should check I2C_M_RECV_LEN
> flag and should read the first byte received as the message length.
>
> The driver configures the QUP hardware to read one byte. Once the
> message length is known from this byte, the QUP hardware is configured
> to read the rest.
>
> Signed-off-by: Naveen Kaje 
> Signed-off-by: Austin Christ 
> ---
>  drivers/i2c/busses/i2c-qup.c | 68 
> ++--
>  1 file changed, 65 insertions(+), 3 deletions(-)
>
> Changes:
> - v3:
>  - clean up redundant checks
>  - use constant instead of variable for smbus length field
> - v2:
>  - rework the smbus block read and break into separate function
>
> diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
> index ea6ca5f..9fbed83 100644
> --- a/drivers/i2c/busses/i2c-qup.c
> +++ b/drivers/i2c/busses/i2c-qup.c
> @@ -517,6 +517,33 @@ static int qup_i2c_get_data_len(struct qup_i2c_dev *qup)
> return data_len;
>  }
>
> +static bool qup_i2c_check_msg_len(struct i2c_msg *msg)
> +{
> +   return ((msg->flags & I2C_M_RD) && (msg->flags & I2C_M_RECV_LEN));
> +}
> +
> +static int qup_i2c_set_tags_smb(u16 addr, u8 *tags, struct qup_i2c_dev *qup,
> +   struct i2c_msg *msg)
> +{
> +   int len = 0;
> +
> +   if (msg->len > 1) {
> +   tags[len++] = QUP_TAG_V2_DATARD_STOP;
> +   tags[len++] = qup_i2c_get_data_len(qup) - 1;
> +   } else {
> +   tags[len++] = QUP_TAG_V2_START;
> +   tags[len++] = addr & 0xff;
> +
> +   if (msg->flags & I2C_M_TEN)
> +   tags[len++] = addr >> 8;
> +
> +   tags[len++] = QUP_TAG_V2_DATARD;
> +   /* Read 1 byte indicating the length of the SMBus message */
> +   tags[len++] = 1;
> +   }
> +   return len;
> +}
> +
>  static int qup_i2c_set_tags(u8 *tags, struct qup_i2c_dev *qup,
> struct i2c_msg *msg,  int is_dma)
>  {
> @@ -526,6 +553,10 @@ static int qup_i2c_set_tags(u8 *tags, struct qup_i2c_dev 
> *qup,
>
> int last = (qup->blk.pos == (qup->blk.count - 1)) && (qup->is_last);
>
> +   /* Handle tags for SMBus block read */
> +   if (qup_i2c_check_msg_len(msg))
> +   return qup_i2c_set_tags_smb(addr, tags, qup, msg);
> +
> if (qup->blk.pos == 0) {
> tags[len++] = QUP_TAG_V2_START;
> tags[len++] = addr & 0xff;
> @@ -1065,9 +1096,17 @@ static int qup_i2c_read_fifo_v2(struct qup_i2c_dev 
> *qup,
> struct i2c_msg *msg)
>  {
> u32 val;
> -   int idx, pos = 0, ret = 0, total;
> +   int idx, pos = 0, ret = 0, total, msg_offset = 0;
>
> +   /*
> +* If the message length is already read in
> +* the first byte of the buffer, account for
> +* that by setting the offset
> +*/
> +   if (qup_i2c_check_msg_len(msg) && (msg->len > 1))
> +   msg_offset = 1;
> total = qup_i2c_get_data_len(qup);
> +   total -= msg_offset;
>
> /* 2 extra bytes for read tags */
> while (pos < (total + 2)) {
> @@ -1087,8 +1126,8 @@ static int qup_i2c_read_fifo_v2(struct qup_i2c_dev *qup,
>
> if (pos >= (total + 2))
> goto out;
> -
> -   msg->buf[qup->pos++] = val & 0xff;
> +   msg->buf[qup->pos + msg_offset] = val & 0xff;
> +   qup->pos++;
> }
> }
>
> @@ -1128,6 +1167,24 @@ static int qup_i2c_read_one_v2(struct qup_i2c_dev 
> *qup, struct i2c_msg *msg)
> goto err;
>
> qup->blk.pos++;
> +
> +   /* Handle SMBus block read length */
> +   if (qup_i2c_check_msg_len(msg) && (msg->len == 1)) {
> +   if (msg->buf[0] > I2C_SMBUS_BLOCK_MAX) {
> +   ret = -EPROTO;
> +   goto err;
> +   }
> +   msg->len += msg->buf[0];
> +   qup->pos = 0;
> +   qup_i2c_set_read_mode_v2(qup, msg->len);
> +   ret = qup_i2c_issue_xfer_v2(qup, msg);
> +   if (ret)
> +   goto err;
> +   ret = qup_i2c_wait_for_complete(qup, msg);
> +   if (ret)
> +   goto err;
> +   qup_i2c_set_blk_data(qup, msg);
> +   }
> } while (qup->blk.pos < qup->blk.count);
>
>  err:
> @@ -1210,6 +1267,11 @@ static int qup_i2c_xfer(struct i2c_adapter *adap,
> goto out;
> }
>
> +   if (qup_i2c_check_msg_len(&msgs[idx])) {
> +

Re: [PATCH 08/10] m68k: Add

2016-05-25 Thread Andreas Schwab

"George Spelvin"  writes:

> Can you recommend a better way to comment this?  My nose is so deep
> in the code it's hard for me to judge.

It's probably best to express the effect of the insns in plain C.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [PATCH 3/3] clk: samsung: exynos5433: add CPU clocks configuration data and instantiate CPU clocks

2016-05-25 Thread Krzysztof Kozlowski

On 05/24/2016 03:19 PM, Bartlomiej Zolnierkiewicz wrote:
> Add the CPU clocks configuration data and instantiate the CPU clocks
> type for Exynos5433.
> 
> Cc: Kukjin Kim 
> CC: Krzysztof Kozlowski 
> Signed-off-by: Bartlomiej Zolnierkiewicz 
> ---
>  drivers/clk/samsung/clk-exynos5433.c | 72 
> 
>  1 file changed, 64 insertions(+), 8 deletions(-)
> 

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

Re: [PATCH 1/7] x86/xen: Simplify set_aliased_prot

2016-05-25 Thread Andrew Cooper

On 24/05/16 23:48, Andy Lutomirski wrote:
> In aa1acff356bb ("x86/xen: Probe target addresses in
> set_aliased_prot() before the hypercall"), I added an explicit probe
> to work around a hypercall issue.  The code can be simplified by
> using probe_kernel_read.
>
> Cc: Andrew Cooper 
> Cc: Boris Ostrovsky 
> Cc: David Vrabel 
> Cc: Jan Beulich 
> Cc: Konrad Rzeszutek Wilk 
> Cc: xen-devel 
> Signed-off-by: Andy Lutomirski 

Reviewed-by: Andrew Cooper

Re: [PATCH RFC kernel] balloon: speed up inflating/deflating process

2016-05-25 Thread Michael S. Tsirkin

On Wed, May 25, 2016 at 09:28:58AM +0000, Li, Liang Z wrote:
> > On Wed, May 25, 2016 at 08:48:17AM +0000, Li, Liang Z wrote:
> > > > > > Suggestion to address all above comments:
> > > > > > 1. allocate a bunch of pages and link them up,
> > > > > >calculating the min and the max pfn.
> > > > > >if max-min exceeds the allocated bitmap size,
> > > > > >tell host.
> > > > >
> > > > > > I am not sure it works well in some cases, e.g. when the allocated
> > > > > > pages span a wide range and max-min > limit is true very
> > > > > > frequently.
> > > > > > Then there will be many virtio transmissions, which is bad
> > > > > > for performance. Right?
> > > >
> > > > It's a tradeoff for sure. Measure it, see what the overhead is.
> > > >
> > >
> > > Hi MST,
> > >
> > > I have measured the performance when using a 32K page bitmap,
> > 
> > Just to make sure. Do you mean a 32Kbyte bitmap?
> > Covering 1Gbyte of memory?
> Yes.
> 
> > 
> > > and inflate the balloon to 3GB
> > > of an idle guest with 4GB RAM.
> > 
> > Should take 3 requests then, right?
> > 
> 
> No,  we can't assign the PFN when allocating page in balloon driver,
> So the PFNs of pages allocated may be across a large range,  we will
> tell the host once the pfn_max - pfn_min >= 0x40000 (1GB range),
> so the requests count is most likely to be more than 3. 
> 
> > > Now:
> > > total inflating time: 338ms
> > > the count of virtio data transmission:  373
> > 
> > Why was this so high? I would expect 3 transmissions.
> 
> I follow your suggestion:
> 
> Suggestion to address all above comments:
>   1. allocate a bunch of pages and link them up,
>  calculating the min and the max pfn.
>  if max-min exceeds the allocated bitmap size,
>  tell host.
>   2. limit allocated bitmap size to something reasonable.
>  How about 32Kbytes? This is 256kilo bit in the map, which comes
>  out to 1Giga bytes of memory in the balloon.
> -
> Because the PFNs of the allocated pages are not linear increased, so 3 
> transmissions
> are  impossible.
> 
> 
> Liang

Interesting. How about, instead of telling the host, we do multiple scans,
each time ignoring pages out of range?

for (pfn = min pfn; pfn < max pfn; pfn += 1G) {
    foreach page
        if page pfn < pfn || page pfn >= pfn + 1G
            continue
        set bit
    tell host
}
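
The multi-scan idea above could look roughly like this in plain C. It is
a userspace sketch under stated assumptions, not driver code:
balloon_multi_scan() and the tell_host_fn callback are hypothetical
names, the page list is a flat array of PFNs, and windows that contain
no pages are skipped, which the pseudocode above does not do.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback standing in for the virtio transmission. */
typedef void (*tell_host_fn)(unsigned long base_pfn,
			     const unsigned char *bitmap,
			     unsigned long nbits, void *ctx);

/*
 * Scan the page list once per window of `window` PFNs and set a bit for
 * each page that falls inside the window. Returns the number of requests
 * sent, or -1 if the bitmap cannot be allocated.
 */
static long balloon_multi_scan(const unsigned long *pfns, size_t n,
			       unsigned long window,
			       tell_host_fn tell_host, void *ctx)
{
	unsigned long min_pfn, max_pfn, base;
	size_t bytes = (window + 7) / 8;
	unsigned char *bitmap;
	long requests = 0;
	size_t i;

	if (n == 0 || window == 0)
		return 0;

	min_pfn = max_pfn = pfns[0];
	for (i = 1; i < n; i++) {
		if (pfns[i] < min_pfn)
			min_pfn = pfns[i];
		if (pfns[i] > max_pfn)
			max_pfn = pfns[i];
	}

	bitmap = malloc(bytes);
	if (!bitmap)
		return -1;

	for (base = min_pfn; base <= max_pfn; base += window) {
		int any = 0;

		memset(bitmap, 0, bytes);
		for (i = 0; i < n; i++) {
			unsigned long off;

			/* Ignore pages outside [base, base + window). */
			if (pfns[i] < base || pfns[i] - base >= window)
				continue;
			off = pfns[i] - base;
			bitmap[off / 8] |= (unsigned char)(1u << (off % 8));
			any = 1;
		}
		if (any) {
			if (tell_host)
				tell_host(base, bitmap, window, ctx);
			requests++;
		}
		/* Last window reached; also avoids overflow of base. */
		if (max_pfn - base < window)
			break;
	}
	free(bitmap);
	return requests;
}
```

The number of requests is then bounded by the PFN range divided by the
window size, independent of allocation order, at the cost of walking the
page list once per window.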

> 
> > 
> > > the call count of madvise: 865
> > >
> > > before:
> > > total inflating time: 175ms
> > > > the count of virtio data transmission: 1
> > > > the call count of madvise: 42
> > >
> > > Maybe the result will be worse if the guest is not idle, or the guest has
> > more RAM.
> > > Do you want more data?
> > >
> > > Is it worth to do that?
> > >
> > > Liang
> > 
> > Either my math is wrong or there's an implementation bug.
> > 
> > > > >
> > > > > > 2. limit allocated bitmap size to something reasonable.
> > > > > >How about 32Kbytes? This is 256kilo bit in the map, which 
> > > > > > comes
> > > > > >out to 1Giga bytes of memory in the balloon.
> > > > >
> > > > > > So, even if the VM has 1TB of RAM, the page bitmap will take 32MB of
> > > > > > memory.
> > > > > > Maybe it's better to use a big page bitmap to save the pages
> > > > > > allocated by the balloon, split the big page bitmap into 32K byte
> > > > > > units, then transfer one unit at a time.
> > > >
> > > > How is this different from what I said?
> > > >
> > > > >
> > > > > Should we use a page bitmap to replace 'vb->pages' ?
> > > > >
> > > > > How about rolling back to use PFNs if the count of requested pages
> > > > > is a
> > > > small number?
> > > > >
> > > > > Liang
> > > >
> > > > That's why we have start pfn. you can use that to pass even a single
> > > > page without a lot of overhead.
> > > >
> > > > > > > --
> > > > > > > 1.9.1
> > > > > > --
> > > > > > To unsubscribe from this list: send the line "unsubscribe kvm"
> > > > > > in the body of a message to majord...@vger.kernel.org More
> > > > > > majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in the body 
> > of
> > a message to majord...@vger.kernel.org More majordomo info at
> > http://vger.kernel.org/majordomo-info.html

Re: [Intel-gfx] [v4.6-10530-g28165ec7a99b] i915: ERROR "CPU pipe/PCH transcoder" A FIFO underrun

2016-05-25 Thread Jani Nikula

On Wed, 25 May 2016, Sedat Dilek  wrote:
> Hi Daniel,
>
> with latest Linus Git I see this with my Intel SandyBridge GPU...
>
> [   17.629014] [drm:intel_cpu_fifo_underrun_irq_handler [i915]]
> *ERROR* CPU pipe A FIFO underrun
> [   17.630652] [drm:intel_set_pch_fifo_underrun_reporting [i915]]
> *ERROR* uncleared pch fifo underrun on pch transcoder A
> [   17.630685] [drm:intel_pch_fifo_underrun_irq_handler [i915]]
> *ERROR* PCH transcoder A FIFO underrun
>
> Attached are my linux-config, dmesg-output and Xorg-log.
>
> Any other information you need?

For starters, please try the fixes pull I just sent on top [1]. There's
a couple of watermark fixes.

BR,
Jani.



[1] http://mid.gmane.org/87zirebjm7@intel.com



-- 
Jani Nikula, Intel Open Source Technology Center

[PATCH] posix-cpu-timers: Fix WARNINGs for 'sizeof(X)' instead of 'sizeof X' in posix-cpu-timers.c

2016-05-25 Thread Wei Tang

This patch fixes the checkpatch.pl WARNINGs to posix-cpu-timers.c like:

WARNING: sizeof timer should be sizeof(timer)

Signed-off-by: Wei Tang 
---
 kernel/time/posix-cpu-timers.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 1cafba8..4803c65 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1271,7 +1271,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, 
int flags,
/*
 * Set up a temporary timer and then wait for it to go off.
 */
-   memset(&timer, 0, sizeof timer);
+   memset(&timer, 0, sizeof(timer));
spin_lock_init(&timer.it_lock);
timer.it_clock = which_clock;
timer.it_overrun = -1;
@@ -1280,7 +1280,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, 
int flags,
if (!error) {
static struct itimerspec zero_it;
 
-   memset(it, 0, sizeof *it);
+   memset(it, 0, sizeof(*it));
it->it_value = *rqtp;
 
spin_lock_irq(&timer.it_lock);
@@ -1373,7 +1373,7 @@ static int posix_cpu_nsleep(const clockid_t which_clock, 
int flags,
/*
 * Report back to the user the time still remaining.
 */
-   if (rmtp && copy_to_user(rmtp, &it.it_value, sizeof *rmtp))
+   if (rmtp && copy_to_user(rmtp, &it.it_value, sizeof(*rmtp)))
return -EFAULT;
 
restart_block->fn = posix_cpu_nsleep_restart;
@@ -1400,7 +1400,7 @@ static long posix_cpu_nsleep_restart(struct restart_block 
*restart_block)
/*
 * Report back to the user the time still remaining.
 */
-   if (rmtp && copy_to_user(rmtp, &it.it_value, sizeof *rmtp))
+   if (rmtp && copy_to_user(rmtp, &it.it_value, sizeof(*rmtp)))
return -EFAULT;
 
restart_block->nanosleep.expires = timespec_to_ns(&t);
-- 
1.9.1

[PATCH] ARM: dts: keystone-k2*: Increase SPI Flash partition size for U-Boot

2016-05-25 Thread Vignesh R

U-Boot SPI Boot image is now more than 512KB for Keystone2 devices and
cannot fit into existing partition. So, increase the SPI Flash partition
for U-Boot to 1MB for all Keystone2 devices.

Signed-off-by: Vignesh R 
---
 arch/arm/boot/dts/keystone-k2e-evm.dts  | 4 ++--
 arch/arm/boot/dts/keystone-k2hk-evm.dts | 4 ++--
 arch/arm/boot/dts/keystone-k2l-evm.dts  | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/boot/dts/keystone-k2e-evm.dts 
b/arch/arm/boot/dts/keystone-k2e-evm.dts
index 4c32ebc1425a..39c91185e112 100644
--- a/arch/arm/boot/dts/keystone-k2e-evm.dts
+++ b/arch/arm/boot/dts/keystone-k2e-evm.dts
@@ -129,13 +129,13 @@
 
partition@0 {
label = "u-boot-spl";
-   reg = <0x0 0x80000>;
+   reg = <0x0 0x100000>;
read-only;
};
 
partition@1 {
label = "misc";
-   reg = <0x80000 0xf80000>;
+   reg = <0x100000 0xf00000>;
};
};
 };
diff --git a/arch/arm/boot/dts/keystone-k2hk-evm.dts 
b/arch/arm/boot/dts/keystone-k2hk-evm.dts
index b38b3441818b..afc70d0481d4 100644
--- a/arch/arm/boot/dts/keystone-k2hk-evm.dts
+++ b/arch/arm/boot/dts/keystone-k2hk-evm.dts
@@ -157,13 +157,13 @@
 
partition@0 {
label = "u-boot-spl";
-   reg = <0x0 0x80000>;
+   reg = <0x0 0x100000>;
read-only;
};
 
partition@1 {
label = "misc";
-   reg = <0x80000 0xf80000>;
+   reg = <0x100000 0xf00000>;
};
};
 };
diff --git a/arch/arm/boot/dts/keystone-k2l-evm.dts 
b/arch/arm/boot/dts/keystone-k2l-evm.dts
index 7f9c2e94d605..cabbdf69ddf5 100644
--- a/arch/arm/boot/dts/keystone-k2l-evm.dts
+++ b/arch/arm/boot/dts/keystone-k2l-evm.dts
@@ -106,13 +106,13 @@
 
partition@0 {
label = "u-boot-spl";
-   reg = <0x0 0x80000>;
+   reg = <0x0 0x100000>;
read-only;
};
 
partition@1 {
label = "misc";
-   reg = <0x80000 0xf80000>;
+   reg = <0x100000 0xf00000>;
};
};
 };
-- 
2.8.3
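
The partition arithmetic behind the Keystone2 change above can be
checked with a few lines of C. This is only a sketch: split_flash() and
struct flash_part are hypothetical names, and the 16 MiB flash size used
in the note below is an assumption, since the commit message only states
the old (512KB) and new (1MB) U-Boot partition sizes.

```c
#include <stdint.h>

struct flash_part {
	uint32_t offset;
	uint32_t size;
};

/*
 * Split a flash of flash_size bytes into a boot partition at offset 0
 * and a "misc" partition covering the rest. Returns 0 on success and
 * -1 if the boot partition leaves no room for "misc".
 */
static int split_flash(uint32_t flash_size, uint32_t boot_size,
		       struct flash_part *boot, struct flash_part *misc)
{
	if (boot_size == 0 || boot_size >= flash_size)
		return -1;

	boot->offset = 0;
	boot->size = boot_size;
	misc->offset = boot_size;
	misc->size = flash_size - boot_size;
	return 0;
}
```

Assuming a 16 MiB flash, a 1 MiB boot partition gives "misc" at offset
0x100000 with size 0xf00000, and the previous 512 KiB boot partition
gives "misc" at offset 0x80000 with size 0xf80000.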

Re: v4.6 kernel BUG at mm/rmap.c:1101!

2016-05-25 Thread Mika Westerberg

On Mon, May 23, 2016 at 05:18:55PM +0200, Andrea Arcangeli wrote:
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 8a839935b18c..0ea5d9071b32 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1098,6 +1098,8 @@ void page_move_anon_rmap(struct page *page,
> >  
> > VM_BUG_ON_PAGE(!PageLocked(page), page);
> > VM_BUG_ON_VMA(!anon_vma, vma);
> > +   if (IS_ENABLED(CONFIG_DEBUG_VM) && PageTransHuge(page))
> > +   address &= HPAGE_PMD_MASK;
> > VM_BUG_ON_PAGE(page->index != linear_page_index(vma, address), page);
> >  
> > anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
> 
> Reviewed-by: Andrea Arcangeli 

My desktop survived overnight without crash so I guess this is

Tested-by: Mika Westerberg 

Thanks.

Re: [PATCH 09/16] sched/fair: Let asymmetric cpu configurations balance at wake-up

2016-05-25 Thread Morten Rasmussen

On Wed, May 25, 2016 at 02:57:00PM +0800, Wanpeng Li wrote:
> 2016-05-23 18:58 GMT+08:00 Morten Rasmussen :
> > Currently, SD_WAKE_AFFINE always takes priority over wakeup balancing if
> > SD_BALANCE_WAKE is set on the sched_domains. For asymmetric
> > configurations SD_WAKE_AFFINE is only desirable if the waking task's
> > compute demand (utilization) is suitable for the cpu capacities
> > available within the SD_WAKE_AFFINE sched_domain. If not, let wakeup
> > balancing take over (find_idlest_{group, cpu}()).
> >
> > The assumption is that SD_WAKE_AFFINE is never set for a sched_domain
> > containing cpus with different capacities. This is enforced by a
> > previous patch based on the SD_ASYM_CPUCAPACITY flag.
> >
> > Ideally, we shouldn't set 'want_affine' in the first place, but we don't
> > know if SD_BALANCE_WAKE is enabled on the sched_domain(s) until we start
> > traversing them.
> >
> > cc: Ingo Molnar 
> > cc: Peter Zijlstra 
> >
> > Signed-off-by: Morten Rasmussen 
> > ---
> >  kernel/sched/fair.c | 28 +++-
> >  1 file changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 564215d..ce44fa7 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -114,6 +114,12 @@ unsigned int __read_mostly sysctl_sched_shares_window 
> > = 1000UL;
> >  unsigned int sysctl_sched_cfs_bandwidth_slice = 5000UL;
> >  #endif
> >
> > +/*
> > + * The margin used when comparing utilization with cpu capacity:
> > + * util * 1024 < capacity * margin
> > + */
> > +unsigned int capacity_margin = 1280; /* ~20% */
> > +
> >  static inline void update_load_add(struct load_weight *lw, unsigned long 
> > inc)
> >  {
> > lw->weight += inc;
> > @@ -5293,6 +5299,25 @@ static int cpu_util(int cpu)
> > return (util >= capacity) ? capacity : util;
> >  }
> >
> > +static inline int task_util(struct task_struct *p)
> > +{
> > +   return p->se.avg.util_avg;
> > +}
> > +
> > +static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
> > +{
> > +   long delta;
> > +   long prev_cap = capacity_of(prev_cpu);
> > +
> > +   delta = cpu_rq(cpu)->rd->max_cpu_capacity - prev_cap;
> > +
> > +   /* prev_cpu is fairly close to max, no need to abort wake_affine */
> > +   if (delta < prev_cap >> 3)
> > +   return 0;
> > +
> > +   return prev_cap * 1024 < task_util(p) * capacity_margin;
> > +}
> 
> If one task util_avg is SCHED_CAPACITY_SCALE and running on x86 box w/
> SMT enabled, then each HT has capacity 589, wake_cap() will result in
> always not wake affine, right?

The idea is that SMT systems would bail out already at the previous
condition. We should have max_cpu_capacity == prev_cap == 589, so delta
is zero, the first condition is true, and wake_cap() always returns 0
for any system with symmetric capacities regardless of their actual
capacity values.

Note that this isn't entirely true, as I used capacity_of() for prev_cap;
if I change that to capacity_orig_of() it should be true.

By making the !wake_cap() condition always true for want_affine, we
should preserve existing behaviour for SMT/SMP. The only overhead is the
capacity delta computation and comparison, which should be cheap.

Does that make sense?

Btw, task util_avg == SCHED_CAPACITY_SCALE should only be possible
temporarily, it should decay to util_avg <=
capacity_orig_of(task_cpu(p)) over time. That doesn't affect your
question though as the second condition would still evaluate true if
util_avg == capacity_orig_of(task_cpu(p)), but as said above the first
condition should bail out before we get here.
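
The bail-out described above can be checked with a small userspace model
of the quoted wake_cap(). This is only a sketch: wake_cap_model() is a
hypothetical name and takes the task utilization and the two capacities
as plain arguments, whereas in the patch they come from task_util(),
capacity_of() and the root domain's max_cpu_capacity.

```c
#include <stdbool.h>

#define CAPACITY_MARGIN 1280 /* ~20%, value taken from the quoted patch */

/*
 * Returns true when the affine wake-up should be abandoned because the
 * task does not fit the capacity of the previous cpu.
 */
static bool wake_cap_model(long task_util, long prev_cap, long max_cap)
{
	long delta = max_cap - prev_cap;

	/* prev_cpu is fairly close to max, no need to abort wake_affine */
	if (delta < (prev_cap >> 3))
		return false;

	return prev_cap * 1024 < task_util * CAPACITY_MARGIN;
}
```

With prev_cap == max_cap == 589 the model returns false for any
utilization, including 1024. With made-up asymmetric capacities of 430
(previous cpu) and 1024 (maximum), a task with utilization 400 returns
true and one with utilization 300 returns false.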

Morten

> > +
> >  /*
> >   * select_task_rq_fair: Select target runqueue for the waking task in 
> > domains
> >   * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
> > @@ -5316,7 +5341,8 @@ select_task_rq_fair(struct task_struct *p, int 
> > prev_cpu, int sd_flag, int wake_f
> >
> > if (sd_flag & SD_BALANCE_WAKE) {
> > record_wakee(p);
> > -   want_affine = !wake_wide(p) && cpumask_test_cpu(cpu, 
> > tsk_cpus_allowed(p));
> > +   want_affine = !wake_wide(p) && !wake_cap(p, cpu, prev_cpu)
> > + && cpumask_test_cpu(cpu, tsk_cpus_allowed(p));
> > }
> >
> > rcu_read_lock();
> > --
> > 1.9.1
> >

Re: [PATCH 5/7] x86/uaccess: Warn on uaccess faults other than #PF

2016-05-25 Thread Borislav Petkov

On Tue, May 24, 2016 at 03:48:42PM -0700, Andy Lutomirski wrote:
> If a uaccess instruction fails due to an error other than #PF,
> warn.  If the fault is #GP, it most likely indicates access to a
> non-canonical address, which means that an access_ok check is
> missing, and that's bad.  If the fault is something else (#UD?),
> then something is very wrong and we should diagnose it rather
> than ignoring it.
> 
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/mm/extable.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
> index 658292fdee5e..c1933471fce7 100644
> --- a/arch/x86/mm/extable.c
> +++ b/arch/x86/mm/extable.c
> @@ -29,6 +29,19 @@ EXPORT_SYMBOL(ex_handler_default);
>  static bool uaccess_fault_okay(int trapnr, unsigned long error_code,
>  unsigned long extra)
>  {
> + /*
> +  * For uaccess, only page faults should be fixed up.  I can't see
> +  * any exploit mitigation value in OOPSing on other types of faults,
> +  * so just warn and continue if that happens.  This means that
> +  * uaccess faults to non-canonical addresses will warn.  That's okay
> +  * -- this will only happen if an access_ok is missing, and we want to
> +  * detect that error if it happens.
> +  */
> + if (WARN_ONCE(trapnr != X86_TRAP_PF,
> +   "unexpected uaccess trap %d (may indicate a missing 
> access_ok on a non-canonical address)\n",
> +   trapnr))

Perhaps dump also regs->ip and make the warn message more helpful...

> + return true;  /* no good reason to OOPS. */

You love those side comments, don'tcha? :-)

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
