date:20160205

[PATCH] jiffies: use CLOCKSOURCE_MASK instead of constant

2016-02-05 Thread Alexander Kuleshov

The CLOCKSOURCE_MASK(32) macro expands to the same value, but
makes the code more readable.

Signed-off-by: Alexander Kuleshov 
---
 kernel/time/jiffies.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index 347fecf..555e21f 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -68,7 +68,7 @@ static struct clocksource clocksource_jiffies = {
.name   = "jiffies",
.rating = 1, /* lowest valid rating*/
.read   = jiffies_read,
-   .mask   = 0xffffffff, /*32bits*/
+   .mask   = CLOCKSOURCE_MASK(32),
.mult   = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
.shift  = JIFFIES_SHIFT,
.max_cycles = 10,
-- 
2.7.0.25.gfc10eb5
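
As a rough userspace check of the claim that CLOCKSOURCE_MASK(32) expands to the same 32-bit constant, the sketch below copies the shape of the macro from include/linux/clocksource.h. The cycle_t typedef and the clocksource_mask_for() helper are assumptions made for illustration only; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's cycle_t (assumed to be 64-bit here). */
typedef uint64_t cycle_t;

/* Same shape as the macro in include/linux/clocksource.h. */
#define CLOCKSOURCE_MASK(bits) \
	((cycle_t)((bits) < 64 ? ((1ULL << (bits)) - 1) : -1))

/* Compile-time check: a 32-bit mask is exactly the old literal. */
static_assert(CLOCKSOURCE_MASK(32) == 0xffffffffULL,
	      "CLOCKSOURCE_MASK(32) must equal the 32-bit all-ones constant");

/* Hypothetical helper so the value can also be inspected at run time. */
static cycle_t clocksource_mask_for(unsigned int bits)
{
	return CLOCKSOURCE_MASK(bits);
}
```

The ternary keeps the shift from being evaluated for widths of 64 or more, where shifting a 64-bit value by its full width would be undefined.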

Re: [PATCH] crypto: testmgr: mark more algorithms as FIPS compliant

2016-02-05 Thread Herbert Xu

On Fri, Feb 05, 2016 at 02:23:33PM +0100, Marcus Meissner wrote:
> Some more authenc() wrapped algorithms are FIPS compliant, tag
> them as such.
> 
> Signed-off-by: Marcus Meissner 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 0/2] crypto: atmel-sha - fix resource management

2016-02-05 Thread Herbert Xu

On Fri, Feb 05, 2016 at 01:45:11PM +0100, Cyrille Pitchen wrote:
> Hi all,
> 
> these two small patches fix resource release and clock management in
> atomic context.

Applied to crypto.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] eCryptfs: fix typos in comment

2016-02-05 Thread Wei Yuan

Signed-off-by: Weiyuan 
---
 fs/ecryptfs/crypto.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index 80d6901..d47a7d4 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -44,7 +44,7 @@
  * ecryptfs_to_hex
  * @dst: Buffer to take hex character representation of contents of
  *   src; must be at least of size (src_size * 2)
- * @src: Buffer to be converted to a hex string respresentation
+ * @src: Buffer to be converted to a hex string representation
  * @src_size: number of bytes to convert
  */
 void ecryptfs_to_hex(char *dst, char *src, size_t src_size)
@@ -59,7 +59,7 @@ void ecryptfs_to_hex(char *dst, char *src, size_t src_size)
  * ecryptfs_from_hex
  * @dst: Buffer to take the bytes from src hex; must be at least of
  *   size (src_size / 2)
- * @src: Buffer to be converted from a hex string respresentation to raw value
+ * @src: Buffer to be converted from a hex string representation to raw value
  * @dst_size: size of dst buffer, or number of hex characters pairs to convert
  */
 void ecryptfs_from_hex(char *dst, char *src, int dst_size)
-- 
2.1.0

Re: [PATCH] crypto: mark authenticated ctr(aes) also as FIPS able

2016-02-05 Thread Herbert Xu

On Thu, Feb 04, 2016 at 03:30:26PM +0100, Marcus Meissner wrote:
> Signed-off-by: Marcus Meissner 

This doesn't compile for me.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v2] fix out of bound read in __test_aead()

2016-02-05 Thread Herbert Xu

On Wed, Feb 03, 2016 at 01:58:12PM +0100, Jerome Marchand wrote:
> __test_aead() reads MAX_IVLEN bytes from template[i].iv, but the
> actual length of the initialisation vector can be shorter.
> The length of the IV is already calculated earlier in the
> function. Let's just reuse that. Also the IV length is currently
> calculated several times for no reason. Let's fix that too.
> This fixes an out-of-bounds error detected by KASan.
> 
> Signed-off-by: Jerome Marchand 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v3 4/4] crypto: testmgr - Add a test case for import()/export()

2016-02-05 Thread Herbert Xu

On Wed, Feb 03, 2016 at 06:26:57PM +0800, Rui Wang wrote:
> Modify __test_hash() so that hash import/export can be tested
> from within the kernel. The test is unconditionally done when
> a struct hash_testvec has its .np > 1.
> 
> v3: make the test unconditional
> v2: Leverage template[i].np as suggested by Tim Chen
> 
> Signed-off-by: Rui Wang 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v5 0/3] crypto: KEYS: convert public key to akcipher api

2016-02-05 Thread Herbert Xu

On Tue, Feb 02, 2016 at 10:08:48AM -0800, Tadeusz Struk wrote:
> Resend v5 rebased on top of 4.5
> 
> This patch set converts the module verification and digital signature
> code to the new akcipher API.
> RSA implementation has been removed from crypto/asymmetric_keys and the
> new API is used for cryptographic primitives.
> There is no need for MPI above the akcipher API anymore.
> Modules can be verified with software as well as HW RSA implementations.
> 
> Patches generated against cryptodev-2.6

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 02/11] crypto: sunxi-ss: prevent compilation on 64-bit

2016-02-05 Thread Herbert Xu

On Mon, Feb 01, 2016 at 05:39:21PM +, Andre Przywara wrote:
> The driver for the sunxi-ss crypto engine is not entirely 64-bit safe,
> compilation on arm64 spits some warnings.
> The proper fix was deemed too involved [1], so since 64-bit SoCs won't
> have this IP block we just disable this driver for 64-bit.
> 
> [1]: 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/399988.html
>  (and the reply)
> 
> Signed-off-by: Andre Przywara 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v2 1/4] crypto x86/sha1_mb: Fix load failure

2016-02-05 Thread Herbert Xu

On Tue, Feb 02, 2016 at 09:56:45PM +0800, Rui Wang wrote:
>
> >From 4bcb73adbef99aada94c49f352063619aa24d43d Mon Sep 17 00:00:00 2001
> From: Rui Wang 
> Date: Mon, 14 Dec 2015 17:22:13 +0800
> Subject: [PATCH v2 1/4] crypto x86/sha1_mb: Fix load failure
> 
> modprobe sha1_mb fails with the following message:
> 
> modprobe: ERROR: could not insert 'sha1_mb': No such device
> 
> It is because it needs to set its statesize and implement its
> import() and export() interface.
> 
> v2: remove redundant call to crypto_shash_init()
> 
> Signed-off-by: Rui Wang 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH v2 1/2] crypto: aead - move aead_request_cast helper to aead.h

2016-02-05 Thread Herbert Xu

On Mon, Feb 01, 2016 at 11:17:30AM -0800, Tadeusz Struk wrote:
> Move the helper function to common header for everybody to use.
> 
> changes in v2:
> - move the helper to crypto/internal/aead.h
>   instead of crypto/aead.h
> 
> Signed-off-by: Tadeusz Struk 

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] clocksource: introduce clocksource_freq2mult()

2016-02-05 Thread Alexander Kuleshov

The clocksource_khz2mult() and clocksource_hz2mult() share similar
code which calculates a mult from the given frequency. Both implementations
differ only in the value of the frequency. This patch introduces the
clocksource_freq2mult() helper with a generic implementation of the
mult calculation to prevent code duplication.

Signed-off-by: Alexander Kuleshov 
---
 include/linux/clocksource.h | 45 +++--
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 6013021..a307bf6 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -118,6 +118,23 @@ struct clocksource {
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)
 
+static inline u32 clocksource_freq2mult(u32 freq, u32 shift_constant, u64 from)
+{
+   /*  freq = cyc/from
+*  mult/2^shift  = ns/cyc
+*  mult = ns/cyc * 2^shift
+*  mult = from/freq * 2^shift
+*  mult = from * 2^shift / freq
+*  mult = (from<
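
The diff is cut off at this point in the archive copy. As a hedged illustration of the derivation in the comment above (mult = from * 2^shift / freq), here is a minimal userspace sketch. The function names freq2mult(), khz2mult() and hz2mult() are illustrative stand-ins rather than the patch's actual code, and the round-to-nearest step is an assumption based on how the existing clocksource_khz2mult() behaves.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_MSEC 1000000ULL

/*
 * Sketch of a generic mult calculation:
 *   mult = (from << shift) / freq, rounded to nearest.
 * "from" is the number of nanoseconds per frequency unit.
 */
static uint32_t freq2mult(uint32_t freq, uint32_t shift_constant, uint64_t from)
{
	uint64_t tmp;

	assert(freq != 0);              /* a zero frequency has no mult */
	tmp = from << shift_constant;   /* from * 2^shift */
	tmp += freq / 2;                /* round to nearest instead of down */
	return (uint32_t)(tmp / freq);
}

/* kHz variant: one cycle at 1 kHz lasts NSEC_PER_MSEC nanoseconds. */
static uint32_t khz2mult(uint32_t khz, uint32_t shift_constant)
{
	return freq2mult(khz, shift_constant, NSEC_PER_MSEC);
}

/* Hz variant: one cycle at 1 Hz lasts NSEC_PER_SEC nanoseconds. */
static uint32_t hz2mult(uint32_t hz, uint32_t shift_constant)
{
	return freq2mult(hz, shift_constant, NSEC_PER_SEC);
}
```

With this shape, khz2mult(1000, 10) and hz2mult(1000000, 10) describe the same 1 MHz clock and should produce the same mult, which is the point of sharing one helper.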

[PATCH] audit: Fix typo in comment

2016-02-05 Thread Wei Yuan

Signed-off-by: Weiyuan 
---
 kernel/audit_watch.c | 2 +-
 kernel/auditfilter.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index 9f194aa..3cf1c59 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -185,7 +185,7 @@ static struct audit_watch *audit_init_watch(char *path)
return watch;
 }
 
-/* Translate a watch string to kernel respresentation. */
+/* Translate a watch string to kernel representation. */
 int audit_to_watch(struct audit_krule *krule, char *path, int len, u32 op)
 {
struct audit_watch *watch;
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index b8ff9e1..94ca7b1 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -158,7 +158,7 @@ char *audit_unpack_string(void **bufp, size_t *remain, 
size_t len)
return str;
 }
 
-/* Translate an inode field to kernel respresentation. */
+/* Translate an inode field to kernel representation. */
 static inline int audit_to_inode(struct audit_krule *krule,
 struct audit_field *f)
 {
@@ -415,7 +415,7 @@ static int audit_field_valid(struct audit_entry *entry, 
struct audit_field *f)
return 0;
 }
 
-/* Translate struct audit_rule_data to kernel's rule respresentation. */
+/* Translate struct audit_rule_data to kernel's rule representation. */
 static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
   size_t datasz)
 {
@@ -593,7 +593,7 @@ static inline size_t audit_pack_string(void **bufp, const 
char *str)
return len;
 }
 
-/* Translate kernel rule respresentation to struct audit_rule_data. */
+/* Translate kernel rule representation to struct audit_rule_data. */
 static struct audit_rule_data *audit_krule_to_data(struct audit_krule *krule)
 {
struct audit_rule_data *data;
-- 
2.1.0

[PATCH 1/2] staging/lustre/libcfs: Properly handle debugfs read- and write-only files

2016-02-05 Thread green

From: Oleg Drokin 

It turns out that unlike procfs, debugfs does not really enforce
permissions for root (similar to regular filesystems), so we need
to ensure we are not providing ->write() method to read-only files
and ->read() method for write-only files at registration.

This fixes a couple of crashes on unexpected access.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/libcfs/module.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/libcfs/module.c 
b/drivers/staging/lustre/lustre/libcfs/module.c
index d5a047b..611607a 100644
--- a/drivers/staging/lustre/lustre/libcfs/module.c
+++ b/drivers/staging/lustre/lustre/libcfs/module.c
@@ -502,13 +502,36 @@ static ssize_t lnet_debugfs_write(struct file *filp, 
const char __user *buf,
return error;
 }
 
-static const struct file_operations lnet_debugfs_file_operations = {
+static const struct file_operations lnet_debugfs_file_operations_rw = {
.open   = simple_open,
.read   = lnet_debugfs_read,
.write  = lnet_debugfs_write,
.llseek = default_llseek,
 };
 
+static const struct file_operations lnet_debugfs_file_operations_ro = {
+   .open   = simple_open,
+   .read   = lnet_debugfs_read,
+   .llseek = default_llseek,
+};
+
+static const struct file_operations lnet_debugfs_file_operations_wo = {
+   .open   = simple_open,
+   .write  = lnet_debugfs_write,
+   .llseek = default_llseek,
+};
+
+static const struct file_operations *lnet_debugfs_fops_select(umode_t mode)
+{
+   if (!(mode & S_IWUGO))
+   return &lnet_debugfs_file_operations_ro;
+
+   if (!(mode & S_IRUGO))
+   return &lnet_debugfs_file_operations_wo;
+
+   return &lnet_debugfs_file_operations_rw;
+}
+
 void lustre_insert_debugfs(struct ctl_table *table,
   const struct lnet_debugfs_symlink_def *symlinks)
 {
@@ -525,7 +548,7 @@ void lustre_insert_debugfs(struct ctl_table *table,
for (; table->procname; table++)
debugfs_create_file(table->procname, table->mode,
lnet_debugfs_root, table,
-   &lnet_debugfs_file_operations);
+   lnet_debugfs_fops_select(table->mode));
 
for (; symlinks && symlinks->name; symlinks++)
debugfs_create_symlink(symlinks->name, lnet_debugfs_root,
-- 
2.1.0
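
The selection logic in this patch can be tried outside the kernel with a small sketch like the one below. Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in (the real code uses struct file_operations, S_IRUGO and S_IWUGO); only the decision order mirrors lnet_debugfs_fops_select().

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the kernel-style "any read bit" / "any write bit" masks. */
#define DEMO_S_IRUGO 0444u
#define DEMO_S_IWUGO 0222u

static_assert((DEMO_S_IRUGO & DEMO_S_IWUGO) == 0,
	      "read and write masks must not overlap");

/* Minimal stand-in for struct file_operations. */
struct demo_fops {
	const char *name;
	int (*read)(void);
	int (*write)(void);
};

static int demo_read(void)  { return 0; }
static int demo_write(void) { return 0; }

static const struct demo_fops demo_fops_rw = { "rw", demo_read, demo_write };
static const struct demo_fops demo_fops_ro = { "ro", demo_read, NULL };
static const struct demo_fops demo_fops_wo = { "wo", NULL, demo_write };

/* Pick a table that only carries the methods the file mode allows. */
static const struct demo_fops *demo_fops_select(unsigned int mode)
{
	if (!(mode & DEMO_S_IWUGO))
		return &demo_fops_ro;   /* no write bits: omit ->write */

	if (!(mode & DEMO_S_IRUGO))
		return &demo_fops_wo;   /* no read bits: omit ->read */

	return &demo_fops_rw;
}
```

Because the write check comes first, a mode with neither read nor write bits ends up with the read-only table, which matches the order of the tests in the patch.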

[PATCH 0/2] Lustre debugfs fixes

2016-02-05 Thread green

From: Oleg Drokin 

These two patches tie up some loose ends from the Lustre debugfs conversion,
but while investigating them I also accumulated some questions
that would be good to get answers for.

1. Unlike procfs, debugfs does not really guard your back: if root
comes in and tries to write to a read-only file (or read a write-only one),
it's allowed (as are permission changes) as long as the appropriate write
(or read) method is provided.
So apparently there's a whole class of bugs related to this. Sample
exhibits include acpi_ec_add_debugfs creating a totally no-op module
parameter to control writes that does not really prevent any writes
(patch submitted separately).
There are also things like wil_debugfs_create_iomem_x32 where, when called
from e.g. wil6210_debugfs_init_offset, some read-only attributes get a generic
write method that would write straight to hardware registers (who knows
what would happen when you write there; possibly they are read-only, but
you are not getting an error).
At first it looked like an easy way to catch this would be to just check
for RO/WO mode with a write/read handler set, but this is thwarted by
the simple attribute defines that always assign read and write methods,
but do the check internally for the get/set method instead.
It is also thwarted by some fault injection code that sets read-only access
on some files, but provides a fully functional write method that works as
desired.

Would it make sense to redo the simple-attribute framework to ease detection
of such cases (and also update writeable attributes to have permissions
reflecting this) and have a corresponding kernel debug compile option
to check for these?

2. I noticed we exported some of the presumably GPL-only debugfs
functionality with plain EXPORT_SYMBOL, so the second patch rectifies this.
Now, I also see that drm_debugfs_create_files allows anybody to
insert any debugfs file anywhere and it is a non-gpl EXPORT_SYMBOL as well,
should it be converted too, or is it sysfs access only that is restricted?

Oleg Drokin (2):
  staging/lustre/libcfs: Properly handle debugfs read- and write-only
files
  staging/lustre/obdclass: export debugfs functionality for GPL only.

 drivers/staging/lustre/lustre/libcfs/module.c  | 27 --
 .../lustre/lustre/obdclass/lprocfs_status.c| 18 +++
 2 files changed, 34 insertions(+), 11 deletions(-)

-- 
2.1.0

[PATCH 2/2] staging/lustre/obdclass: export debugfs functionality for GPL only.

2016-02-05 Thread green

From: Oleg Drokin 

Turns out we mistakenly export some pretty-wide-reaching debugfs
functions as EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL as we should,
so this patch rectifies the situation.

Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lustre/obdclass/lprocfs_status.c| 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c 
b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index b65ad93..9a1434d 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -261,7 +261,7 @@ struct dentry *ldebugfs_add_simple(struct dentry *root,
}
return entry;
 }
-EXPORT_SYMBOL(ldebugfs_add_simple);
+EXPORT_SYMBOL_GPL(ldebugfs_add_simple);
 
 static struct file_operations lprocfs_generic_fops = { };
 
@@ -294,14 +294,14 @@ int ldebugfs_add_vars(struct dentry *parent,
}
return 0;
 }
-EXPORT_SYMBOL(ldebugfs_add_vars);
+EXPORT_SYMBOL_GPL(ldebugfs_add_vars);
 
 void ldebugfs_remove(struct dentry **entryp)
 {
debugfs_remove_recursive(*entryp);
*entryp = NULL;
 }
-EXPORT_SYMBOL(ldebugfs_remove);
+EXPORT_SYMBOL_GPL(ldebugfs_remove);
 
 struct dentry *ldebugfs_register(const char *name,
 struct dentry *parent,
@@ -327,7 +327,7 @@ struct dentry *ldebugfs_register(const char *name,
 out:
return entry;
 }
-EXPORT_SYMBOL(ldebugfs_register);
+EXPORT_SYMBOL_GPL(ldebugfs_register);
 
 /* Generic callbacks */
 int lprocfs_rd_uint(struct seq_file *m, void *data)
@@ -942,7 +942,7 @@ int lprocfs_obd_setup(struct obd_device *obd, struct 
lprocfs_vars *list,
 
return rc;
 }
-EXPORT_SYMBOL(lprocfs_obd_setup);
+EXPORT_SYMBOL_GPL(lprocfs_obd_setup);
 
 int lprocfs_obd_cleanup(struct obd_device *obd)
 {
@@ -957,7 +957,7 @@ int lprocfs_obd_cleanup(struct obd_device *obd)
 
return 0;
 }
-EXPORT_SYMBOL(lprocfs_obd_cleanup);
+EXPORT_SYMBOL_GPL(lprocfs_obd_cleanup);
 
 int lprocfs_stats_alloc_one(struct lprocfs_stats *stats, unsigned int cpuid)
 {
@@ -1219,7 +1219,7 @@ int ldebugfs_register_stats(struct dentry *parent, const 
char *name,
 
return 0;
 }
-EXPORT_SYMBOL(ldebugfs_register_stats);
+EXPORT_SYMBOL_GPL(ldebugfs_register_stats);
 
 void lprocfs_counter_init(struct lprocfs_stats *stats, int index,
  unsigned conf, const char *name, const char *units)
@@ -1446,7 +1446,7 @@ int ldebugfs_seq_create(struct dentry *parent,
 
return 0;
 }
-EXPORT_SYMBOL(ldebugfs_seq_create);
+EXPORT_SYMBOL_GPL(ldebugfs_seq_create);
 
 int ldebugfs_obd_seq_create(struct obd_device *dev,
const char *name,
@@ -1457,7 +1457,7 @@ int ldebugfs_obd_seq_create(struct obd_device *dev,
return ldebugfs_seq_create(dev->obd_debugfs_entry, name,
   mode, seq_fops, data);
 }
-EXPORT_SYMBOL(ldebugfs_obd_seq_create);
+EXPORT_SYMBOL_GPL(ldebugfs_obd_seq_create);
 
 void lprocfs_oh_tally(struct obd_histogram *oh, unsigned int value)
 {
-- 
2.1.0

[PATCH] staging/lustre/lnet: Don't call roundup_pow_of_two on zero in LNetEQAlloc

2016-02-05 Thread green

From: Oleg Drokin 

The return value of roundup_pow_of_two() when called with a zero argument
is undefined, so don't call it like that.

This fixes a problem introduced by commit 322489d9d551
("staging/lustre: Use roundup_pow_of_two() in LNetEQAlloc()"),
since 0 is a valid count parameter for LNetEQAlloc. The problem also
manifests itself as an annoying kernel warning:
LNet: 3486:0:(lib-eq.c:85:LNetEQAlloc()) EQ callback is guaranteed to get every 
event, do you still want to set eqcount 1 for polling event which will have 
locking overhead? Please contact with developer to confirm

Signed-off-by: Oleg Drokin 
CC: Pekka Enberg 
---
 drivers/staging/lustre/lnet/lnet/lib-eq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-eq.c 
b/drivers/staging/lustre/lnet/lnet/lib-eq.c
index 64f94a6..bfbc313 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-eq.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-eq.c
@@ -79,7 +79,8 @@ LNetEQAlloc(unsigned int count, lnet_eq_handler_t callback,
 * overflow, they don't skip entries, so the queue has the same
 * apparent capacity at all times */
 
-   count = roundup_pow_of_two(count);
+   if (count)
+   count = roundup_pow_of_two(count);
 
if (callback != LNET_EQ_HANDLER_NONE && count != 0)
CWARN("EQ callback is guaranteed to get every event, do you 
still want to set eqcount %d for polling event which will have locking 
overhead? Please contact with developer to confirm\n", count);
-- 
2.1.0
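
To illustrate why the zero case needs its own branch, here is a small userspace sketch. demo_roundup_pow_of_two() is a simplified stand-in for the kernel helper (it is not the kernel implementation), and eq_count_normalize() is a hypothetical name for the guarded call added by this patch.

```c
#include <assert.h>

/*
 * Simplified stand-in for roundup_pow_of_two(): smallest power of two
 * that is >= n. Like the kernel helper, it has no defined result for 0,
 * so that case is rejected here instead of being silently computed.
 */
static unsigned int demo_roundup_pow_of_two(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);
	assert(n <= 0x80000000u);   /* keep the result representable */
	while (p < n)
		p <<= 1;
	return p;
}

/* Mirrors the patched logic: a count of 0 is valid and stays 0. */
static unsigned int eq_count_normalize(unsigned int count)
{
	if (count)
		count = demo_roundup_pow_of_two(count);
	return count;
}
```

Keeping 0 untouched means the later "count != 0" test in LNetEQAlloc() still sees the caller's original intent.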

Re: [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-05 Thread Andy Lutomirski

On Feb 5, 2016 8:30 PM, "Luis R. Rodriguez"  wrote:
>
> paravirt_enabled conveys the idea that if this is set or if
> paravirt_enabled() returns true you are in a paravirtualized
> environment. This is not true by any means, and left as-is
> is just causing confusion and is prone to be misused and abused.
>
> This primitive is really only useful to determine if you have a
> paravirtualization hypervisor that supports legacy paravirtualized
> guests. At run time, this tells us if we've booted into a Linux guest
> with support for legacy devices and features.
>
> To avoid further issues with semantics on this we loosely borrow
> the definition of "legacy" from both the ACPI 5.2.9.3 "IA-PC Boot
> Architecture Flags" section and the PC 2001 definition in the PC
> Systems design guide [0]:
>
>   paravirt_legacy() is true if this hypervisor supports legacy
> x86 paravirtualized guests.

This needs to be far more concrete.  I'm reasonably well versed in x86
details relevant to kernels and I have *no clue* what your semantics
mean.

> +/**
> + * struct pv_info - paravirt hypervisor information
> + *
> + * @supports_x86_legacy: true if this hypervisor supports legacy x86
> + * paravirtualized guests.  The definition of legacy here adheres
> + * *loosely* to both the notion of legacy in the ACPI 5.2.9.3 "IA-PC Boot
> + * Architecture Flags" section and the PC 2001 "legacy free" concept [1]
> + * referred to in the PC System Design Guide [2] [3] on Chapter 3, Page 50
> + * [4].  Legacy x86 guests systems are guest systems which are not "legacy
> + * free" as per the PC 2001 definition, and in the ACPI sense could have
> + * any of the legacy ACPI IA-PC Boot architecture flags set. These are x86
> + * systems with any type of legacy peripherals or requirements.
> + *
> + * Examples of some popular legacy peripherals:
> + *
> + *   a) Floppy drive
> + *   b) Legacy ports [1] such as parallel ports, PS/2 connectors,
> + *  serial ports / RS-232, game ports, Parallel ATA, and IEEE 1394
> + *   c) ISA bus
> + *
> + * Examples of features required to support such type of legacy guests
> + * are the need for APM and a PNP BIOS.

Seriously?  I think you just defined every standard native x86 system
as well as QEMU/KVM as "legacy".

Can we just enumerate this crap?  I propose:

Xen PV and lguest are paravirt_legacy.  Nothing else is
paravirt_legacy.  The addition of new paravirt_legacy support is
strongly discouraged.

--Andy

[PATCH] acpi/ec: Deny write access unless requested by module param

2016-02-05 Thread green

From: Oleg Drokin 

In debugfs it's not enough to just set the file mode to read-only to
deny write access to a file; instead, just don't provide
the write method unless write access is really requested.

Signed-off-by: Oleg Drokin 
---
I assume allowing run-time changes via /sys/module is preferable,
as opposed to a forced module unload and reload to change this option,
but I can submit another patch to depend only on the module parameter
too, please let me know.

 drivers/acpi/ec_sys.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
index bea8e42..6c7dd7a 100644
--- a/drivers/acpi/ec_sys.c
+++ b/drivers/acpi/ec_sys.c
@@ -73,6 +73,9 @@ static ssize_t acpi_ec_write_io(struct file *f, const char 
__user *buf,
loff_t init_off = *off;
int err = 0;
 
+   if (!write_support)
+   return -EINVAL;
+
if (*off >= EC_SPACE_SIZE)
return 0;
if (*off + count >= EC_SPACE_SIZE) {
-- 
2.1.0
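
The effect of the added check can be sketched in userspace as follows. All demo_ names are hypothetical; the buffer stands in for the EC address space and demo_write_support stands in for the write_support module parameter. The clamping mirrors the bounds checks visible in the diff context.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_EC_SPACE_SIZE 256

/* Stand-in for the run-time changeable module parameter. */
static bool demo_write_support;
/* Stand-in for the embedded controller address space. */
static unsigned char demo_ec_space[DEMO_EC_SPACE_SIZE];

/* Returns bytes written, 0 past the end, or -EINVAL when writes are off. */
static long demo_ec_write(const unsigned char *buf, size_t count, size_t *off)
{
	assert(buf != NULL && off != NULL);

	if (!demo_write_support)
		return -EINVAL;          /* the check added by the patch */

	if (*off >= DEMO_EC_SPACE_SIZE)
		return 0;
	if (*off + count >= DEMO_EC_SPACE_SIZE)
		count = DEMO_EC_SPACE_SIZE - *off;

	memcpy(&demo_ec_space[*off], buf, count);
	*off += count;
	return (long)count;
}
```

Checking the flag on every call, rather than only when the file is created, is what lets a later change of the parameter take effect without reloading the module.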

[PATCH] arm64: defconfig: add spmi and usb related configs

2016-02-05 Thread Srinivas Kandagatla

This patch adds the Kconfig options for SPMI bus support, pinctrl drivers
and USB that are needed to get USB working on the Qualcomm DB410C board.

Signed-off-by: Srinivas Kandagatla 
---
 arch/arm64/configs/defconfig | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 8c6f05f..ad26a53 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -144,7 +144,9 @@ CONFIG_I2C_RCAR=y
 CONFIG_SPI=y
 CONFIG_SPI_PL022=y
 CONFIG_SPI_QUP=y
+CONFIG_SPMI=y
 CONFIG_PINCTRL_MSM8916=y
+CONFIG_PINCTRL_QCOM_SPMI_PMIC=y
 CONFIG_GPIO_PL061=y
 CONFIG_GPIO_RCAR=y
 CONFIG_GPIO_XGENE=y
@@ -152,9 +154,11 @@ CONFIG_POWER_RESET_MSM=y
 CONFIG_POWER_RESET_XGENE=y
 CONFIG_POWER_RESET_SYSCON=y
 # CONFIG_HWMON is not set
+CONFIG_MFD_SPMI_PMIC=y
 CONFIG_REGULATOR=y
 CONFIG_REGULATOR_FIXED_VOLTAGE=y
 CONFIG_REGULATOR_QCOM_SMD_RPM=y
+CONFIG_REGULATOR_QCOM_SPMI=y
 CONFIG_FB=y
 CONFIG_FB_ARMCLCD=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
@@ -167,13 +171,21 @@ CONFIG_SND_SOC=y
 CONFIG_SND_SOC_RCAR=y
 CONFIG_SND_SOC_AK4613=y
 CONFIG_USB=y
+CONFIG_USB_OTG=y
 CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_MSM=y
 CONFIG_USB_EHCI_HCD_PLATFORM=y
 CONFIG_USB_OHCI_HCD=y
 CONFIG_USB_OHCI_HCD_PLATFORM=y
 CONFIG_USB_STORAGE=y
+CONFIG_USB_CHIPIDEA=y
+CONFIG_USB_CHIPIDEA_UDC=y
+CONFIG_USB_CHIPIDEA_HOST=y
 CONFIG_USB_ISP1760=y
+CONFIG_USB_HSIC_USB3503=y
+CONFIG_USB_MSM_OTG=y
 CONFIG_USB_ULPI=y
+CONFIG_USB_GADGET=y
 CONFIG_MMC=y
 CONFIG_MMC_BLOCK_MINORS=32
 CONFIG_MMC_ARMMMCI=y
@@ -216,6 +228,7 @@ CONFIG_QCOM_SMD_RPM=y
 CONFIG_ARCH_TEGRA_132_SOC=y
 CONFIG_ARCH_TEGRA_210_SOC=y
 CONFIG_HISILICON_IRQ_MBIGEN=y
+CONFIG_EXTCON_USB_GPIO=y
 CONFIG_PHY_XGENE=y
 CONFIG_EXT2_FS=y
 CONFIG_EXT3_FS=y
-- 
1.9.1

Re: [PATCH 3/5] oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space

2016-02-05 Thread Michal Hocko

On Thu 04-02-16 15:43:19, Michal Hocko wrote:
> On Thu 04-02-16 23:22:18, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > From: Michal Hocko 
> > > 
> > > When oom_reaper manages to unmap all the eligible vmas there shouldn't
> > > be much of the freeable memory held by the oom victim left anymore so it
> > > makes sense to clear the TIF_MEMDIE flag for the victim and allow the
> > > OOM killer to select another task.
> > 
> > Just a confirmation. Is it safe to clear TIF_MEMDIE without reaching
> > do_exit() with regard to freezing_slow_path()? Since clearing TIF_MEMDIE
> > from the OOM reaper confuses
> > 
> > wait_event(oom_victims_wait, !atomic_read(&oom_victims));
> > 
> > in oom_killer_disable(), I'm worrying that the freezing operation continues
> > before the OOM victim which escaped the __refrigerator() actually releases
> > memory. Does this cause consistency problem?
> 
> This is a good question! At first sight it seems this is not safe and we
> might need to make the oom_reaper freezable so that it doesn't wake up
> during suspend and interfere. Let me think about that.

OK, I was thinking about it some more and it seems you are right here.
oom_reaper as a kernel thread is not freezable automatically and so it
might interfere after all the processes/kernel threads are considered
frozen. Then it really might clear TIF_MEMDIE too early and wake up
oom_killer_disable. wait_event_freezable is not sufficient because the
oom_reaper might be running while the PM freezer is freezing tasks and
the freezer will miss it because it doesn't see it.

So I think we might need this. I am heading to vacation today and will
be offline for the next week so I will prepare the full patch with the
proper changelog after I get back:

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ca61e6cfae52..7e9953a64489 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -521,6 +521,8 @@ static void oom_reap_task(struct task_struct *tsk)

 static int oom_reaper(void *unused)
 {
+   set_freezable();
+
while (true) {
struct task_struct *tsk = NULL;

-- 
Michal Hocko
SUSE Labs

[PATCH 2/2] f2fs: support revoking atomic written pages

2016-02-05 Thread Chao Yu

f2fs supports atomic writes with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file

With this flow we can avoid the file becoming corrupted on an abnormal power
cut, because we hold the data of the transaction in referenced pages linked in
the inmem_pages list of the inode, but without setting them dirty, so the data
won't be persisted unless we commit it in step 4.

But we should still hold the journal db file in memory by using volatile
writes, because our semantics of 'atomic write support' are incomplete: in
step 4, we could fail to submit all dirty data of the transaction. Once
partial dirty data has been committed to storage, then after a checkpoint and
an abnormal power cut, the db file will be corrupted forever.

So this patch tries to improve the atomic write flow by adding a revoking flow:
once an internal error occurs during committing, this gives another chance to
revoke the partially submitted data of the current transaction, which makes
the committing operation more like an atomic one.

If we're not lucky and the revoking operation fails, EAGAIN will be
reported to the user, suggesting either doing the recovery with the held
journal file or retrying the current transaction.
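
The commit-then-revoke behaviour described above can be modelled with a toy in-memory transaction. This is only a sketch of the semantics, not f2fs code: struct demo_txn, demo_commit() and the fault-injection parameters are all hypothetical, and returning -EIO for a failed commit that was revoked successfully is an assumption standing in for whatever error the commit path originally hit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NBLOCKS 4

struct demo_txn {
	int storage[DEMO_NBLOCKS];  /* what is currently "on disk" */
	int pending[DEMO_NBLOCKS];  /* in-memory pages of the transaction */
	int old[DEMO_NBLOCKS];      /* pre-commit contents, kept for revoke */
};

/*
 * fail_at:   block index where the simulated commit error hits (-1: none).
 * revoke_ok: nonzero if the simulated revoke succeeds.
 *
 * Returns 0 on a full commit, -EIO if the commit failed but the partial
 * writes were revoked, and -EAGAIN if the revoke failed as well.
 */
static int demo_commit(struct demo_txn *t, int fail_at, int revoke_ok)
{
	int i, done = 0;

	assert(t != NULL);

	for (i = 0; i < DEMO_NBLOCKS; i++) {
		if (i == fail_at)
			break;                      /* injected commit failure */
		t->old[i] = t->storage[i];          /* remember the old block */
		t->storage[i] = t->pending[i];
		done++;
	}
	if (done == DEMO_NBLOCKS)
		return 0;

	if (!revoke_ok)
		return -EAGAIN;   /* partial data stays; caller recovers or retries */

	while (done-- > 0)                          /* undo in reverse order */
		t->storage[done] = t->old[done];
	return -EIO;
}
```

In this model, a caller that sees -EAGAIN knows storage may hold a mix of old and new blocks, which is the case where the commit message suggests falling back to the journal file or retrying the transaction.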

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c  |   1 +
 fs/f2fs/f2fs.h  |   4 +-
 fs/f2fs/file.c  |   2 +-
 fs/f2fs/recovery.c  |   2 +-
 fs/f2fs/segment.c   | 118 +++-
 fs/f2fs/segment.h   |   1 +
 include/trace/events/f2fs.h |   1 +
 7 files changed, 93 insertions(+), 36 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5071cf3..d168814 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1073,6 +1073,7 @@ int do_write_data_page(struct f2fs_io_info *fio)
return err;
 
fio->blk_addr = dn.data_blkaddr;
+   fio->old_blkaddr = dn.data_blkaddr;
 
/* This page is already truncated */
if (fio->blk_addr == NULL_ADDR) {
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 111e2e1..4dfa76c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -686,6 +686,7 @@ enum page_type {
META_FLUSH,
INMEM,  /* the below types are used by tracepoints only. */
INMEM_DROP,
+   INMEM_REVOKE,
IPU,
OPU,
 };
@@ -695,6 +696,7 @@ struct f2fs_io_info {
enum page_type type;/* contains DATA/NODE/META/META_FLUSH */
int rw; /* contains R/RS/W/WS with REQ_META/REQ_PRIO */
block_t blk_addr;   /* block address to be written */
+   block_t old_blkaddr;/* old block address before Cow */
struct page *page;  /* page to be written */
struct page *encrypted_page;/* encrypted page */
 };
@@ -1853,7 +1855,7 @@ void write_node_page(unsigned int, struct f2fs_io_info *);
 void write_data_page(struct dnode_of_data *, struct f2fs_io_info *);
 void rewrite_data_page(struct f2fs_io_info *);
 void f2fs_replace_block(struct f2fs_sb_info *, struct dnode_of_data *,
-   block_t, block_t, unsigned char, bool);
+   block_t, block_t, unsigned char, bool, bool);
 void allocate_data_block(struct f2fs_sb_info *, struct page *,
block_t, block_t *, struct f2fs_summary *, int);
 void f2fs_wait_on_page_writeback(struct page *, enum page_type, bool);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 2fa055e..2b7305b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -883,7 +883,7 @@ static int __exchange_data_block(struct inode *inode, 
pgoff_t src,
 
get_node_info(sbi, dn.nid, &ni);
f2fs_replace_block(sbi, &dn, dn.data_blkaddr, new_addr,
-   ni.version, true);
+   ni.version, true, false);
f2fs_put_dnode(&dn);
} else {
struct page *psrc, *pdst;
diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 5045dd6..0b30cd2 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -465,7 +465,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
 
/* write dummy data page */
f2fs_replace_block(sbi, &dn, src, dest,
-   ni.version, false);
+   ni.version, false, false);
recovered++;
}
}
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 3d09d63..3f551c5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -191,24 +191,48 @@ void register_inmem_page(struct inode *inode, struct page 
*page)
trace_f2fs_register_inmem_page(page, INMEM);
 }
 
-static void __revoke_inmem_pages(struct inode *inode,
-   struct list_head *head)
+static int __revoke_inmem_pages(struct inode *inode,
+   struct list_head *head,

Re: [PATCH V3 5/5] rtc: max77686: move initialisation of rtc regmap, irq chip locally

2016-02-05 Thread Laxman Dewangan


Hi Javier,
'
On Saturday 06 February 2016 11:00 AM, Javier Martinez Canillas wrote:

Hello Laxman,

Sorry for not doing this before but today was a busy one.



Thanks for testing.



On 02/05/2016 11:37 AM, Laxman Dewangan wrote:

Hi Krzysztof, Javier,




3. Extension of 2
Do regmap_add_irq_chip(), call regmap_irq_get_virq() to create the irq
mapping, but don't do any interrupt registration, i.e. comment out
request_threaded_irq() and hence free_irq().

Then do unbind/bind and then suspend.
   This is to check whether it happens only when a client has registered
   the interrupt, or with just the mapping as well.




This fails, so the problem seems to be with the mapping.

So I tried another scenario:

4. Call regmap_del_irq_chip() just after regmap_irq_get_virq() and try to S2R
   without doing any unbind before.

   To test if this is a general issue with regmap_del_irq_chip() after doing
   the IRQ mapping and not something specific to the remove callback.

The machine failed to boot. So now at least we have narrowed down the issue.

I've looked at both regmap_irq_get_virq() and regmap_del_irq_chip() but I
couldn't find any obvious cause for the issue we are seeing. But it's late
Friday so probably I should just stop here and take a fresh look on Monday.




So the issue is that once we create the mapping, we cannot delete the irq_chip.

I saw one function in the irq framework: irq_dispose_mapping(unsigned int virq).

So we need to dispose of the mapping before deleting the irq chip.

Because this also reproduces in a normal boot if we create the mapping and
then delete the irq chip data, I should be able to validate it myself if I
get some time over the weekend.

[PATCH 1/2] f2fs: split drop_inmem_pages from commit_inmem_pages

2016-02-05 Thread Chao Yu

Split drop_inmem_pages from commit_inmem_pages for code readability,
and prepare for the following modification.

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h|   3 +-
 fs/f2fs/file.c|   6 ++--
 fs/f2fs/inode.c   |   2 +-
 fs/f2fs/segment.c | 103 +-
 fs/f2fs/super.c   |   2 +-
 5 files changed, 70 insertions(+), 46 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index f6a841b..111e2e1 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1830,7 +1830,8 @@ void destroy_node_manager_caches(void);
  * segment.c
  */
 void register_inmem_page(struct inode *, struct page *);
-int commit_inmem_pages(struct inode *, bool);
+void drop_inmem_pages(struct inode *);
+int commit_inmem_pages(struct inode *);
 void f2fs_balance_fs(struct f2fs_sb_info *, bool);
 void f2fs_balance_fs_bg(struct f2fs_sb_info *);
 int f2fs_issue_flush(struct f2fs_sb_info *);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 3ffb6c1..2fa055e 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1252,7 +1252,7 @@ static int f2fs_release_file(struct inode *inode, struct file *filp)
 {
/* some remained atomic pages should discarded */
if (f2fs_is_atomic_file(inode))
-   commit_inmem_pages(inode, true);
+   drop_inmem_pages(inode);
if (f2fs_is_volatile_file(inode)) {
set_inode_flag(F2FS_I(inode), FI_DROP_CACHE);
filemap_fdatawrite(inode->i_mapping);
@@ -1376,7 +1376,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 
if (f2fs_is_atomic_file(inode)) {
clear_inode_flag(F2FS_I(inode), FI_ATOMIC_FILE);
-   ret = commit_inmem_pages(inode, false);
+   ret = commit_inmem_pages(inode);
if (ret) {
set_inode_flag(F2FS_I(inode), FI_ATOMIC_FILE);
goto err_out;
@@ -1439,7 +1439,7 @@ static int f2fs_ioc_abort_volatile_write(struct file *filp)
 
if (f2fs_is_atomic_file(inode)) {
clear_inode_flag(F2FS_I(inode), FI_ATOMIC_FILE);
-   commit_inmem_pages(inode, true);
+   drop_inmem_pages(inode);
}
if (f2fs_is_volatile_file(inode)) {
clear_inode_flag(F2FS_I(inode), FI_VOLATILE_FILE);
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 60e3b30..d447707 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -324,7 +324,7 @@ void f2fs_evict_inode(struct inode *inode)
 
/* some remained atomic pages should discarded */
if (f2fs_is_atomic_file(inode))
-   commit_inmem_pages(inode, true);
+   drop_inmem_pages(inode);
 
trace_f2fs_evict_inode(inode);
truncate_inode_pages_final(&inode->i_data);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 57a5f7b..3d09d63 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -191,56 +191,67 @@ void register_inmem_page(struct inode *inode, struct page *page)
trace_f2fs_register_inmem_page(page, INMEM);
 }
 
-int commit_inmem_pages(struct inode *inode, bool abort)
+static void __revoke_inmem_pages(struct inode *inode,
+   struct list_head *head)
+{
+   struct inmem_pages *cur, *tmp;
+
+   list_for_each_entry_safe(cur, tmp, head, list) {
+   trace_f2fs_commit_inmem_page(cur->page, INMEM_DROP);
+
+   lock_page(cur->page);
+   ClearPageUptodate(cur->page);
+   set_page_private(cur->page, 0);
+   ClearPagePrivate(cur->page);
+   f2fs_put_page(cur->page, 1);
+
+   list_del(&cur->list);
+   kmem_cache_free(inmem_entry_slab, cur);
+   dec_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
+   }
+}
+
+void drop_inmem_pages(struct inode *inode)
+{
+   struct f2fs_inode_info *fi = F2FS_I(inode);
+
+   mutex_lock(&fi->inmem_lock);
+   __revoke_inmem_pages(inode, &fi->inmem_pages);
+   mutex_unlock(&fi->inmem_lock);
+}
+
+static int __commit_inmem_pages(struct inode *inode)
 {
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct f2fs_inode_info *fi = F2FS_I(inode);
struct inmem_pages *cur, *tmp;
-   bool submit_bio = false;
struct f2fs_io_info fio = {
.sbi = sbi,
.type = DATA,
.rw = WRITE_SYNC | REQ_PRIO,
.encrypted_page = NULL,
};
+   bool submit_bio = false;
int err = 0;
 
-   /*
-* The abort is true only when f2fs_evict_inode is called.
-* Basically, the f2fs_evict_inode doesn't produce any data writes, so
-* that we don't need to call f2fs_balance_fs.
-* Otherwise, f2fs_gc in f2fs_balance_fs can wait forever until this
-* inode becomes free by iget_locked in f2fs_iget.
-*/
-   if (!abort) {
-   f2fs_balance_fs(sbi, true);
-   f2fs_lock_op(sbi);
-   }
-
-

RE: [f2fs-dev] [PATCH 2/2] f2fs: support revoking atomic written pages

2016-02-05 Thread Chao Yu

Hi Jaegeuk,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Saturday, February 06, 2016 12:18 PM
> To: Chao Yu
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: support revoking atomic written 
> pages
> 
> Hi Chao,
> 
> On Tue, Feb 02, 2016 at 06:19:06PM +0800, Chao Yu wrote:
> > > > > > > From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> > > > > > > Sent: Wednesday, January 13, 2016 9:18 AM
> > > > > > > To: Chao Yu
> > > > > > > Cc: linux-kernel@vger.kernel.org; 
> > > > > > > linux-f2fs-de...@lists.sourceforge.net
> > > > > > > Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: support revoking atomic 
> > > > > > > written pages
> > > > > > >
> > > > > > > Hi Chao,
> > > > > > >
> > > > > > > I just injected -EIO for one page among two pages in total into 
> > > > > > > database file.
> > > > > > > Then, I tested valid and invalid journal file to see how sqlite 
> > > > > > > recovers the
> > > > > > > transaction.
> > > > > > >
> > > > > > > Interestingly, if journal is valid, database file is recovered, 
> > > > > > > as I could see
> > > > > > > the transaction result even after it shows EIO.
> > > > > > > But, in the invalid journal case, somehow it drops database 
> > > > > > > changes.
> > > > > >
> > > > > > If journal has valid data in its header and corrupted data in its 
> > > > > > body, sqlite will
> > > > > > recover db file from corrupted journal file, then db file will be 
> > > > > > corrupted.
> > > > > > So what you mean is: after recovery, db file still be fine? or 
> > > > > > sqlite fails to
> > > > > > recover due to drop data in journal since the header of journal is 
> > > > > > not valid?
> > > > >
> > > > > In the above case, I think I made broken journal header. At the same 
> > > > > time, I
> > > > > broke database file too, but I could see that database file is 
> > > > > recovered
> > > > > likewise roll-back. I couldn't find corruption of database.
> > > > >
> > > > > Okay, I'll test again by corrupting journal body with valid header.
> > >
> > > Hmm, it's quite difficult to produce any corruption case.
> > >
> > > I tried the below tests, but in all the cases, sqlite did rollback 
> > > successfully.
> >
> > As you saw valid db file at final, I suspect that:
> > a) db file was recovered by f2fs: after we fail in atomic commit, if
> >checkpoint isn't be triggered to persist partial pages of one
> >transaction, db file will be recovered to last transaction after an
> >abnormal power-cut by f2fs.
> > b) or db file was recovered by sqlite: sqlite will try to do the
> >revoking after it detects failure of atomic commit. Similarly, db
> >file will be recovered.
> >
> > >
> > >  - -EIO for one db write with valid header + valid body in journal
> > >  - -EIO for one db write with valid header + invalid body in journal
> > >  - -EIO for one db write with invalid header + valid body in journal
> > >
> > > Note that, I checked both integrity_check and table contents after each 
> > > tests.
> > >
> > > I suspect that journal uses checksums to validate its contents?
> >
> > Yes, there is one checksum after each 4K-size journal page.
> >
> > IMO, it's better to just destroy last one or two journal pages to make
> > corrupted journal file. For example, if there are 10 pages in journal, let
> > kworker writebacks [0-7] pages include partial old pages of transaction
> > and journal header, and holds [8-9] pages in memory, so in disk, [8-9]
> > pages were invalid to sqlite due to wrong checksum, and other pages will
> > be judged as valid for recovery. Note that, pages after first invalid
> > page were also be judged as invalid by sqlite.
> 
> Hmm, I couldn't find out the exact scenario to corrupt db finally.
> But, when I took a look at the below document, I could agree that it is
> a possible scenario.
> 
> https://www.sqlite.org/howtocorrupt.html
> 
> If possible, could you rebase the patches based on the latest dev-test?
> I want to review the patch seriously.

No problem, please help to review following patches. :)

Thanks,

> 
> Thanks,
> 
> >
> > Thanks,
> >
> > >
> > > Thanks,
> > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > > I'm not sure it was because I just skip second page write of 
> > > > > > > database file tho.
> > > > > > > (I added random bytes into journal pages.)
> > > > > > > I'll break the database file with more random bytes likewise what 
> > > > > > > I did for
> > > > > > > journal.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > On Fri, Jan 08, 2016 at 11:43:06AM -0800, Jaegeuk Kim wrote:
> > > > > > > > On Fri, Jan 08, 2016 at 08:05:52PM +0800, Chao Yu wrote:
> > > > > > > > > Hi Jaegeuk,
> > > > > > > > >
> > > > > > > > > Any progress on this patch?
> > > > > > > >
> > > > > > > > Swamped. Will do.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > >
> > >

Re: [PATCH 4/5] mm, oom_reaper: report success/failure

2016-02-05 Thread Michal Hocko

On Fri 05-02-16 10:26:40, Michal Hocko wrote:
[...]
> From 402090df64de7f80d7d045b0b17e860220837fa6 Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Fri, 5 Feb 2016 10:24:23 +0100
> Subject: [PATCH] mm-oom_reaper-report-success-failure-fix
> 
> update the log message to be more specific
> 
> Signed-off-by: Michal Hocko 
> ---
>  mm/oom_kill.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 87d644c97ac9..ca61e6cfae52 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -479,7 +479,7 @@ static bool __oom_reap_task(struct task_struct *tsk)
>   }
>   }
>   tlb_finish_mmu(&tlb, 0, -1);
> - pr_info("oom_reaper: reaped process :%d (%s) anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lulB\n",
> + pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lulB\n",

Dohh, s@lulB@lukB@

>   task_pid_nr(tsk), tsk->comm,
>   K(get_mm_counter(mm, MM_ANONPAGES)),
>   K(get_mm_counter(mm, MM_FILEPAGES)),
> -- 
> 2.7.0
> 
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs
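
For reference, a small user-space sketch of the corrected format string from the message above. It assumes 4 KiB pages (a PAGE_SHIFT of 12) and writes the K() pages-to-kB helper from memory, so both are assumptions rather than a copy of mm/oom_kill.c; format_rss() is a made-up name used only for illustration.

```c
#include <stdio.h>

/* Assumption: 4 KiB pages, as on most x86 configurations. */
#define PAGE_SHIFT 12

/* Convert a page count to kB; modelled on the kernel helper of the same name. */
#define K(x) ((x) << (PAGE_SHIFT - 10))

/*
 * Hypothetical helper: formats the three RSS counters the way the fixed
 * pr_info() call would, with "%lukB" for every field.
 */
static int format_rss(char *buf, size_t len, unsigned long anon_pages,
                      unsigned long file_pages, unsigned long shmem_pages)
{
    return snprintf(buf, len,
                    "now anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB",
                    K(anon_pages), K(file_pages), K(shmem_pages));
}
```

With the "%lulB" typo the last field would still be parsed as "%lu" followed by the literal text "lB", so the number would print correctly but with the wrong unit suffix.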

Re: [PATCH 5/5] mm, oom_reaper: implement OOM victims queuing

2016-02-05 Thread Tetsuo Handa

Michal Hocko wrote:
> > But if we consider non system-wide OOM events, it is not very unlikely to 
> > hit
> > this race. This queue is useful for situations where memcg1 and memcg2 hit
> > memcg OOM at the same time and victim1 in memcg1 cannot terminate 
> > immediately.
> 
> This can happen of course but the likelihood is _much_ smaller without
> the global OOM because the memcg OOM killer is invoked from a lockless
> context so the oom context cannot block the victim to proceed.

Suppose mem_cgroup_out_of_memory() is called from a lockless context via
mem_cgroup_oom_synchronize() called from pagefault_out_of_memory(), that
"lockless" is talking about only current thread, doesn't it?

Since oom_kill_process() sets TIF_MEMDIE on first mm!=NULL thread of a
victim process, it is possible that non-first mm!=NULL thread triggers
pagefault_out_of_memory() and first mm!=NULL thread gets TIF_MEMDIE,
isn't it?

Then, where is the guarantee that victim1 (first mm!=NULL thread in memcg1
which got TIF_MEMDIE) is not waiting at down_read(>mm->mmap_sem)
when victim2 (first mm!=NULL thread in memcg2 which got TIF_MEMDIE) is
waiting at down_write(>mm->mmap_sem) or both victim1 and victim2
are waiting on a lock somewhere in memory reclaim path (e.g.
mutex_lock(>i_mutex))?

Re: [PATCH V3 5/5] rtc: max77686: move initialisation of rtc regmap, irq chip locally

2016-02-05 Thread Javier Martinez Canillas


Hello Laxman,

Sorry for not doing this before but today was a busy one.

On 02/05/2016 11:37 AM, Laxman Dewangan wrote:

Hi Krzysztof, Javier,

On Thursday 04 February 2016 02:38 PM, Krzysztof Kozlowski wrote:

On 04.02.2016 15:58, Krzysztof Kozlowski wrote:

3. Can you try locally to not use devm_regmap_init_i2c() and just use
the regmap_init_i2c() and proper removal of this from error path and
remove callback?

I'll try to find some time for that. Maybe tomorrow.

regmap_init_i2c() does not help. However, it does help to comment out the:
regmap_del_irq_chip(info->rtc_irq, info->rtc_irq_data);
call in the remove() callback.




I am trying to reproduce this on my system but I am ending up with a different
issue, as I need to enable suspend.


Ok, no worries. I can of course help testing on my system.


Can you please help with the following experiments:
1. In probe/init, do
regmap_add_irq_chip()
   regmap_del_irq_chip() and then
   regmap_add_irq_chip() and
then, without unbind()/bind(), check whether it works or not.

   This is to make sure that it is a universal issue rather than one seen
   only when calling from the remove callback.



This works, the system can S2R successfully.
 

2. Do regmap_add_irq_chip() but don't do any interrupt registration, i.e. comment out
regmap_irq_get_virq() and request_threaded_irq() and hence free_irq().
Then do unbind/bind and then suspend.
   This is to check whether it happens only when a client has registered the interrupt.



This works as well.
 

3. Extension of 2
Do regmap_add_irq_chip(), call regmap_irq_get_virq() to create the irq mapping,
but don't do any interrupt registration, i.e. comment out request_threaded_irq()
and hence free_irq().
Then do unbind/bind and then suspend.
   This is to check whether it happens only when a client has registered the
   interrupt, or with just the mapping as well.



This fails, so the problem seems to be with the mapping.

So I tried another scenario:

4. Call regmap_del_irq_chip() just after regmap_irq_get_virq() and try to S2R
   without doing any unbind before.

   To test if this is a general issue with regmap_del_irq_chip() after doing
   the IRQ mapping and not something specific to the remove callback.

The machine failed to boot. So now at least we have narrowed down the issue.

I've looked at both regmap_irq_get_virq() and regmap_del_irq_chip() but I
couldn't find any obvious cause for the issue we are seeing. But it's late
Friday so probably I should just stop here and take a fresh look on Monday.

Best regards,
--
Javier Martinez Canillas
Open Source Group
Samsung Research America

[PATCH] Platform: goldfish: goldfish_pipe.c: Add DMA support using managed version

2016-02-05 Thread Shraddha Barke

setup_access_params_addr has 2 goals:

- Initialize the access_params field so that it can be used to send and read
  commands from the device in access_with_param.
- Get a bus address for the allocated memory to transfer to the device.

Replace the combination of devm_kzalloc and __pa() with dmam_alloc_coherent.
Coherent mapping guarantees that the device and CPU are in sync.

Signed-off-by: Shraddha Barke 
---
 drivers/platform/goldfish/goldfish_pipe.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c
index e7a29e2..4b0babb 100644
--- a/drivers/platform/goldfish/goldfish_pipe.c
+++ b/drivers/platform/goldfish/goldfish_pipe.c
@@ -57,6 +57,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * IMPORTANT: The following constants must match the ones used and defined
@@ -217,17 +218,16 @@ static int valid_batchbuffer_addr(struct goldfish_pipe_dev *dev,
 static int setup_access_params_addr(struct platform_device *pdev,
struct goldfish_pipe_dev *dev)
 {
-   u64 paddr;
+   dma_addr_t dma_handle;
struct access_params *aps;
 
-   aps = devm_kzalloc(&pdev->dev, sizeof(struct access_params), GFP_KERNEL);
+   aps = dmam_alloc_coherent(&pdev->dev, sizeof(struct access_params),
+ &dma_handle, GFP_KERNEL);
if (!aps)
-   return -1;
+   return -ENOMEM;
 
-   /* FIXME */
-   paddr = __pa(aps);
-   writel((u32)(paddr >> 32), dev->base + PIPE_REG_PARAMS_ADDR_HIGH);
-   writel((u32)paddr, dev->base + PIPE_REG_PARAMS_ADDR_LOW);
+   writel(upper_32_bits(dma_handle), dev->base + PIPE_REG_PARAMS_ADDR_HIGH);
+   writel(lower_32_bits(dma_handle), dev->base + PIPE_REG_PARAMS_ADDR_LOW);
 
if (valid_batchbuffer_addr(dev, aps)) {
dev->aps = aps;
-- 
2.1.4
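
As a side note to the goldfish_pipe patch above, here is a user-space sketch of how a 64-bit bus address is split across a high/low register pair. UPPER_32() and LOWER_32() are illustrative stand-ins written from memory of the kernel's upper_32_bits()/lower_32_bits() helpers, and struct addr_regs is a hypothetical placeholder for the PIPE_REG_PARAMS_ADDR_HIGH/LOW registers, so none of this is the driver's actual code.

```c
#include <stdint.h>

/*
 * Shifting twice by 16 instead of once by 32 keeps the expression well
 * defined even when the argument is only 32 bits wide, which matters for
 * address types whose width depends on the configuration.
 */
#define UPPER_32(n) ((uint32_t)(((n) >> 16) >> 16))
#define LOWER_32(n) ((uint32_t)((n) & 0xffffffff))

/* Hypothetical register pair; a real driver would use writel() on MMIO. */
struct addr_regs {
    uint32_t high;
    uint32_t low;
};

/* Program both halves of the address, high word first as in the patch. */
static void program_addr(struct addr_regs *regs, uint64_t bus_addr)
{
    regs->high = UPPER_32(bus_addr);
    regs->low = LOWER_32(bus_addr);
}
```

The original code shifted a u64 by 32 directly, which is fine for u64 but would be undefined for a 32-bit wide address type; the double shift sidesteps that.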

[PATCH v2] x86/boot: add BIT() to boot/bitops.h

2016-02-05 Thread Luis R. Rodriguez

boot/bitops.h has guards against including the regular
bitops (include/asm-generic/bitops.h); it only implements
what we need at early boot. We'll be making use of BIT()
later, so add it.

Users of boot/boot.h must include it prior to asm/setup.h,
otherwise the guard protection devised against the regular
linux/bitops.h will not take effect.

v2: spelling fixes and language description enhancements
by Konrad.

Signed-off-by: Luis R. Rodriguez 
---

This patch was originally part of a much larger series [0];
this is v2 of the original patch 3/8 [1]. I've split this single
patch out on its own now that it should be clear how I intend
to use BIT() in early code.

[0] http://lkml.kernel.org/r/1450217797-19295-1-git-send-email-mcg...@do-not-panic.com
[1] http://lkml.kernel.org/r/1450217797-19295-4-git-send-email-mcg...@do-not-panic.com

 arch/x86/boot/bitops.h | 2 ++
 arch/x86/boot/boot.h   | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/bitops.h b/arch/x86/boot/bitops.h
index 878e4b9940d9..232cff0ff4e3 100644
--- a/arch/x86/boot/bitops.h
+++ b/arch/x86/boot/bitops.h
@@ -40,4 +40,6 @@ static inline void set_bit(int nr, void *addr)
asm("btsl %1,%0" : "+m" (*(u32 *)addr) : "Ir" (nr));
 }
 
+#define BIT(x) (1 << x)
+
 #endif /* BOOT_BITOPS_H */
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 9011a88353de..4fb53da1f48a 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -23,8 +23,8 @@
 #include 
 #include 
 #include 
-#include 
 #include "bitops.h"
+#include 
 #include "ctype.h"
 #include "cpuflags.h"
 
-- 
2.7.0
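
One observation on the BIT() definition in the patch above: the macro argument is not parenthesized, so an argument containing a lower-precedence operator expands in a surprising way. The sketch below contrasts the posted form with a hardened variant; BIT_AS_POSTED, BIT_SAFE, and the two helper functions are names invented here for illustration, and the hardened variant is a suggestion, not part of the patch.

```c
/* The definition as posted: the argument is pasted in unparenthesized. */
#define BIT_AS_POSTED(x) (1 << x)

/* Hardened variant (an assumption, not from the patch): unsigned, parenthesized. */
#define BIT_SAFE(x) (1UL << (x))

/* Expands to (1 << 1 | 2), which parses as ((1 << 1) | 2). */
static unsigned long posted_mask(void)
{
    return BIT_AS_POSTED(1 | 2);
}

/* Expands to (1UL << (1 | 2)), i.e. bit 3. */
static unsigned long safe_mask(void)
{
    return BIT_SAFE(1 | 2);
}
```

For plain constant arguments such as BIT(3) both forms agree, so the difference only shows up with compound expressions.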

[PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-05 Thread Luis R. Rodriguez

paravirt_enabled conveys the idea that if this is set, or if
paravirt_enabled() returns true, you are in a paravirtualized
environment. This is not true by any means, and left as-is it
just causes confusion and is prone to misuse and abuse.

This primitive is really only useful to determine if you have a
paravirtualization hypervisor that supports legacy paravirtualized
guests. At run time, this tells us if we've booted into a Linux guest
with support for legacy devices and features.

To avoid further issues with semantics on this we loosely borrow
the definition of "legacy" from both the ACPI 5.2.9.3 "IA-PC Boot
Architecture Flags" section and the PC 2001 definition in the PC
Systems design guide [0]:

  paravirt_legacy() is true if this hypervisor supports legacy
x86 paravirtualized guests.

Rename the member and helper to make this clear, and document
this well. With proper documentation we can now avoid special-cased
comments trying to explain what the heck this is.

[0] http://tech-insider.org/windows/research/2000/1102.html
[1] http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf

v2:

* Fix 0-day bot build issue on arch/x86/entry/entry_32.S
  where I forgot to update the upper case offset name,
  defined in arch/x86/kernel/asm-offsets.c

* Add more documentation and references for what exactly is
  x86 legacy, and how this inspired the notion of a paravirt
  legacy device or feature.

* Use supports_x86_legacy on the struct member to make it
  clearer what this bool is for, keep the paravirt_legacy()
  from v1.

* Split out changes into a few patches to make it easier
  to review and test.

The rename is done using the following Coccinelle SmPL patch:

@ rename_paravirt_enabled @
@@

-paravirt_enabled()
+paravirt_legacy()

@ rename_pv_info_pv_enabled @
@@
-pv_info.paravirt_enabled
+pv_info.supports_x86_legacy

@ is_pv @
identifier pv;
@@
struct pv_info pv = {
};

@ rename_struct_pv_enabled depends on is_pv @
identifier is_pv.pv;
expression val;
@@

struct pv_info pv = {
-   .paravirt_enabled
+   .supports_x86_legacy
= val,
};

Generated-by: Coccinelle SmPL
Suggested-by: Konrad Rzeszutek Wilk 
Cc: Robert Moore 
Cc: Fengguang Wu 
Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/entry/entry_32.S |  2 +-
 arch/x86/include/asm/paravirt.h   |  6 +++---
 arch/x86/include/asm/paravirt_types.h | 35 +--
 arch/x86/include/asm/processor.h  |  2 +-
 arch/x86/kernel/apm_32.c  |  2 +-
 arch/x86/kernel/asm-offsets.c |  2 +-
 arch/x86/kernel/cpu/intel.c   |  2 +-
 arch/x86/kernel/cpu/microcode/core.c  |  2 +-
 arch/x86/kernel/head.c|  2 +-
 arch/x86/kernel/kvm.c |  9 +
 arch/x86/kernel/paravirt.c|  2 +-
 arch/x86/kernel/rtc.c |  2 +-
 arch/x86/kernel/tboot.c   |  2 +-
 arch/x86/lguest/boot.c|  4 ++--
 arch/x86/mm/dump_pagetables.c |  2 +-
 arch/x86/xen/enlighten.c  |  2 +-
 drivers/pnp/pnpbios/core.c|  2 +-
 17 files changed, 52 insertions(+), 28 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 4c5228352744..6a248022549c 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -395,7 +395,7 @@ ldt_ss:
 * is still available to implement the setting of the high
 * 16-bits in the INTERRUPT_RETURN paravirt-op.
 */
-   cmpl$0, pv_info+PARAVIRT_enabled
+   cmpl$0, pv_info+PARAVIRT_legacy
jne restore_nocheck
 #endif
 
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 6542aa99714b..b3885c1f2156 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -14,14 +14,14 @@
 #include 
 #include 
 
-static inline bool paravirt_enabled(void)
+static inline bool paravirt_legacy(void)
 {
-   return pv_info.paravirt_enabled;
+   return pv_info.supports_x86_legacy;
 }
 
 static inline bool paravirt_has_feature(unsigned int feature)
 {
-   WARN_ON_ONCE(!paravirt_enabled());
+   WARN_ON_ONCE(!paravirt_legacy());
return !!(pv_info.features & feature);
 }
 
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index de2382b023f2..b4094a57435d 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -61,6 +61,37 @@ struct paravirt_callee_save {
 };
 
 /* general info */
+
+/**
+ * struct pv_info - paravirt hypervisor information
+ *
+ * @supports_x86_legacy: true if this hypervisor supports legacy x86
+ * paravirtualized guests.  The definition of legacy here adheres
+ * *loosely* to both the notion of legacy in the ACPI 5.2.9.3 "IA-PC Boot
+ * Architecture Flags" section and the PC 2001 "legacy free" concept [1]
+ * referred to in the PC System Design Guide [2] [3] on Chapter 3, Page 50
+ * [4].  Legacy x86

[PATCH v2 1/3] paravirt: use bool for paravirt_enabled() and paravirt_has_feature()

2016-02-05 Thread Luis R. Rodriguez

This avoids any possible misuse.

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/include/asm/paravirt.h   | 6 +++---
 arch/x86/include/asm/paravirt_types.h | 2 +-
 arch/x86/include/asm/processor.h  | 4 ++--
 arch/x86/kernel/kvm.c | 2 +-
 arch/x86/kernel/paravirt.c| 2 +-
 arch/x86/lguest/boot.c| 2 +-
 arch/x86/xen/enlighten.c  | 2 +-
 7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index f6192502149e..60a71dfe0c4e 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -14,15 +14,15 @@
 #include 
 #include 
 
-static inline int paravirt_enabled(void)
+static inline bool paravirt_enabled(void)
 {
return pv_info.paravirt_enabled;
 }
 
-static inline int paravirt_has_feature(unsigned int feature)
+static inline bool paravirt_has_feature(unsigned int feature)
 {
WARN_ON_ONCE(!pv_info.paravirt_enabled);
-   return (pv_info.features & feature);
+   return !!(pv_info.features & feature);
 }
 
 static inline void load_sp0(struct tss_struct *tss,
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 77db5616a473..de2382b023f2 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -69,7 +69,7 @@ struct pv_info {
u16 extra_user_64bit_cs;  /* __USER_CS if none */
 #endif
 
-   int paravirt_enabled;
+   bool paravirt_enabled;
unsigned int features;/* valid only if paravirt_enabled is set */
const char *name;
 };
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 491a3d9dbb15..5a8e7a61d5be 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -470,8 +470,8 @@ static inline unsigned long current_top_of_stack(void)
 #include 
 #else
 #define __cpuidnative_cpuid
-#define paravirt_enabled() 0
-#define paravirt_has(x)0
+#define paravirt_enabled() false
+#define paravirt_has(x)false
 
 static inline void load_sp0(struct tss_struct *tss,
struct thread_struct *thread)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 47190bd399e7..5c717b247e1b 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -290,7 +290,7 @@ static void __init paravirt_ops_setup(void)
 * features, and paravirt_enabled is about features that are
 * missing.
 */
-   pv_info.paravirt_enabled = 0;
+   pv_info.paravirt_enabled = false;
 
if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
pv_cpu_ops.io_delay = kvm_io_delay;
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index f08ac28b8136..6b1f205a6ac7 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -294,7 +294,7 @@ enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
 
 struct pv_info pv_info = {
.name = "bare hardware",
-   .paravirt_enabled = 0,
+   .paravirt_enabled = false,
.kernel_rpl = 0,
.shared_kernel_pmd = 1, /* Only used when CONFIG_X86_PAE is set */
 
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index a9033ae13369..c6f302f6dedb 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1409,7 +1409,7 @@ __init void lguest_init(void)
/* We're under lguest. */
pv_info.name = "lguest";
/* Paravirt is enabled. */
-   pv_info.paravirt_enabled = 1;
+   pv_info.paravirt_enabled = true;
/* We're running at privilege level 1, not 0 as normal. */
pv_info.kernel_rpl = 1;
/* Everyone except Xen runs with this set. */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 2c261082eadf..e303e0043881 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1186,7 +1186,7 @@ static unsigned xen_patch(u8 type, u16 clobbers, void *insnbuf,
 }
 
 static const struct pv_info xen_info __initconst = {
-   .paravirt_enabled = 1,
+   .paravirt_enabled = true,
.shared_kernel_pmd = 0,
 
 #ifdef CONFIG_X86_64
-- 
2.7.0
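
To illustrate why patch 1/3 above pairs the bool return type with "!!", here is a user-space sketch of what can go wrong when a raw mask result is squeezed into a narrower integer. FEATURE_HIGH, has_feature_raw(), and has_feature() are hypothetical names chosen for this example; the kernel's feature bits and helpers are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature flag living above bit 7. */
#define FEATURE_HIGH 0x100u

/* Raw mask result truncated to 8 bits: the set bit is silently lost. */
static uint8_t has_feature_raw(unsigned int features, unsigned int feature)
{
    return (uint8_t)(features & feature);
}

/* Normalized to 0 or 1 before being returned, as the patch does with !!. */
static bool has_feature(unsigned int features, unsigned int feature)
{
    return !!(features & feature);
}
```

With a C99 bool return type the conversion alone already yields 0 or 1, so the "!!" is arguably belt and braces; it does make the intent explicit and stays correct if the value is later stored in some other integer type.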

[PATCH v2 0/3] paravirt: rebrand paravirt_enabled as paravirt_legacy

2016-02-05 Thread Luis R. Rodriguez

There's been confusion, both in code and among developers, as to what
paravirt_enabled means. This series sets out to clarify it, to help
build stronger semantics into our bootup process.

This was originally suggested by Konrad and I included this as part of
a larger patch set [0]. I've decided to break that single rename patch (1/8)
out into 3 smaller patches, both to make it easier to review and to help
with regression testing, should any issues arise.

[0] http://lkml.kernel.org/r/1450217797-19295-1-git-send-email-mcg...@do-not-panic.com
[1] http://lkml.kernel.org/r/1450217797-19295-2-git-send-email-mcg...@do-not-panic.com

Luis R. Rodriguez (3):
  paravirt: use bool for paravirt_enabled() and paravirt_has_feature()
  paravirt: replace direct access to pv_info.paravirt_enabled
  paravirt: rename paravirt_enabled to paravirt_legacy

 arch/x86/entry/entry_32.S |  2 +-
 arch/x86/include/asm/paravirt.h   | 10 +-
 arch/x86/include/asm/paravirt_types.h | 35 +--
 arch/x86/include/asm/processor.h  |  4 ++--
 arch/x86/kernel/apm_32.c  |  2 +-
 arch/x86/kernel/asm-offsets.c |  2 +-
 arch/x86/kernel/cpu/intel.c   |  2 +-
 arch/x86/kernel/cpu/microcode/core.c  |  2 +-
 arch/x86/kernel/head.c|  2 +-
 arch/x86/kernel/kvm.c |  9 +
 arch/x86/kernel/paravirt.c|  2 +-
 arch/x86/kernel/rtc.c |  2 +-
 arch/x86/kernel/tboot.c   |  2 +-
 arch/x86/lguest/boot.c|  4 ++--
 arch/x86/mm/dump_pagetables.c |  2 +-
 arch/x86/xen/enlighten.c  |  2 +-
 drivers/pnp/pnpbios/core.c|  2 +-
 17 files changed, 55 insertions(+), 31 deletions(-)

-- 
2.7.0

[PATCH v2 2/3] paravirt: replace direct access to pv_info.paravirt_enabled

2016-02-05 Thread Luis R. Rodriguez

Use the helper; that's why it's there.

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/include/asm/paravirt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 60a71dfe0c4e..6542aa99714b 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -21,7 +21,7 @@ static inline bool paravirt_enabled(void)
 
 static inline bool paravirt_has_feature(unsigned int feature)
 {
-   WARN_ON_ONCE(!pv_info.paravirt_enabled);
+   WARN_ON_ONCE(!paravirt_enabled());
return !!(pv_info.features & feature);
 }
 
-- 
2.7.0

Re: [f2fs-dev] [PATCH 2/2] f2fs: support revoking atomic written pages

2016-02-05 Thread Jaegeuk Kim

Hi Chao,

On Tue, Feb 02, 2016 at 06:19:06PM +0800, Chao Yu wrote:
> > > > > > From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> > > > > > Sent: Wednesday, January 13, 2016 9:18 AM
> > > > > > To: Chao Yu
> > > > > > Cc: linux-kernel@vger.kernel.org; 
> > > > > > linux-f2fs-de...@lists.sourceforge.net
> > > > > > Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: support revoking atomic 
> > > > > > written pages
> > > > > >
> > > > > > Hi Chao,
> > > > > >
> > > > > > I just injected -EIO for one page among two pages in total into 
> > > > > > database file.
> > > > > > Then, I tested valid and invalid journal file to see how sqlite 
> > > > > > recovers the
> > > > > > transaction.
> > > > > >
> > > > > > Interestingly, if journal is valid, database file is recovered, as 
> > > > > > I could see
> > > > > > the transaction result even after it shows EIO.
> > > > > > But, in the invalid journal case, somehow it drops database changes.
> > > > >
> > > > > If journal has valid data in its header and corrupted data in its 
> > > > > body, sqlite will
> > > > > recover db file from corrupted journal file, then db file will be 
> > > > > corrupted.
> > > > > So what you mean is: after recovery, db file still be fine? or sqlite 
> > > > > fails to
> > > > > recover due to drop data in journal since the header of journal is 
> > > > > not valid?
> > > >
> > > > In the above case, I think I made broken journal header. At the same 
> > > > time, I
> > > > broke database file too, but I could see that database file is recovered
> > > > likewise roll-back. I couldn't find corruption of database.
> > > >
> > > > Okay, I'll test again by corrupting journal body with valid header.
> > 
> > Hmm, it's quite difficult to produce any corruption case.
> > 
> > I tried the below tests, but in all the cases, sqlite did rollback 
> > successfully.
> 
> As you saw valid db file at final, I suspect that:
> a) db file was recovered by f2fs: after we fail in atomic commit, if
>checkpoint isn't be triggered to persist partial pages of one
>transaction, db file will be recovered to last transaction after an
>abnormal power-cut by f2fs.
> b) or db file was recovered by sqlite: sqlite will try to do the
>revoking after it detects failure of atomic commit. Similarly, db
>file will be recovered.
> 
> > 
> >  - -EIO for one db write with valid header + valid body in journal
> >  - -EIO for one db write with valid header + invalid body in journal
> >  - -EIO for one db write with invalid header + valid body in journal
> > 
> > Note that, I checked both integrity_check and table contents after each 
> > tests.
> > 
> > I suspect that journal uses checksums to validate its contents?
> 
> Yes, there is one checksum after each 4K-size journal page.
> 
> IMO, it's better to just destroy last one or two journal pages to make
> corrupted journal file. For example, if there are 10 pages in journal, let
> kworker writebacks [0-7] pages include partial old pages of transaction
> and journal header, and holds [8-9] pages in memory, so in disk, [8-9]
> pages were invalid to sqlite due to wrong checksum, and other pages will
> be judged as valid for recovery. Note that, pages after first invalid
> page were also be judged as invalid by sqlite.

Hmm, in the end I couldn't find an exact scenario that corrupts the db.
But, when I took a look at the document below, I could agree that it is a
possible scenario.

https://www.sqlite.org/howtocorrupt.html

If possible, could you rebase the patches based on the latest dev-test?
I want to review the patch seriously.

Thanks,

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> > > >
> > > > Thanks,
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > > I'm not sure it was because I just skip second page write of 
> > > > > > database file tho.
> > > > > > (I added random bytes into journal pages.)
> > > > > > I'll break the database file with more random bytes likewise what I 
> > > > > > did for
> > > > > > journal.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > On Fri, Jan 08, 2016 at 11:43:06AM -0800, Jaegeuk Kim wrote:
> > > > > > > On Fri, Jan 08, 2016 at 08:05:52PM +0800, Chao Yu wrote:
> > > > > > > > Hi Jaegeuk,
> > > > > > > >
> > > > > > > > Any progress on this patch?
> > > > > > >
> > > > > > > Swamped. Will do.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > > -Original Message-
> > > > > > > > > From: Chao Yu [mailto:c...@kernel.org]
> > > > > > > > > Sent: Friday, January 01, 2016 8:14 PM
> > > > > > > > > To: Jaegeuk Kim
> > > > > > > > > Cc: linux-kernel@vger.kernel.org; 
> > > > > > > > > linux-f2fs-de...@lists.sourceforge.net
> > > > > > > > > Subject: Re: [f2fs-dev] [PATCH 2/2] f2fs: support revoking 
> > > > > > > > > atomic written pages
> > > > > > > > >
> > > > > > > > > Hi Jaegeuk,
> > > > > > > > >
> > > > > > > > > On 1/1/16 11:50 AM, Jaegeuk Kim wrote:
> > > > > > > > > > Hi Chao,
> > > > > > > > > >

Re: [PATCH v2 1/2] ARM: bcm2835: dt: Add the ethernet to the device trees

2016-02-05 Thread Stephen Warren

On 02/04/2016 12:36 AM, Lubomir Rintel wrote:
> The hub and the ethernet in its port 1 are hardwired on the board.
> 
> Compared to the adapters that can be plugged into the USB ports, this
> one has no serial EEPROM to store its MAC. Nevertheless, the Raspberry Pi
> has the MAC address for this adapter in its ROM, accessible from its
> firmware.
> 
> U-Boot can read out the address and set the local-mac-address property of the
> node with "ethernet" alias. Let's add the node so that U-Boot can do its
> business.
> 
> Model B rev2 and Model B+ entries were verified by me, the hierarchy and
> pid/vid pair for the Version 2 was provided by Olivier Blin. Original
> Model B is a blind shot, though very likely correct.

The series,
Tested-by: Stephen Warren 

A few nits though...

>  arch/arm/boot/dts/bcm2835-rpi-b-plus.dts | 18 ++
>  arch/arm/boot/dts/bcm2835-rpi-b-rev2.dts | 18 ++
>  arch/arm/boot/dts/bcm2835-rpi-b.dts  | 18 ++
>  arch/arm/boot/dts/bcm2836-rpi-2-b.dts| 18 ++
>  arch/arm/boot/dts/bcm283x.dtsi   |  4 +++-

Rather than cut/paste everything, can't we share the duplicate content
using a *.dtsi file? IIRC some dtsi files already exist. Perhaps there
could be a bcm283x-rpi-smsc9512.dtsi and bcm283x-rpi-smsc9514.dtsi, and
even a bcm283x-rpi-smsc-eth.dtsi since 99% of those two are common?
Hopefully that doesn't sound like busy work/bike-shedding too much.

> diff --git a/arch/arm/boot/dts/bcm2835-rpi-b-plus.dts 
> b/arch/arm/boot/dts/bcm2835-rpi-b-plus.dts

> + {
> + usb1@1 {
> + compatible = "usb0424,9514";
> + reg = <01>;

Here and ...

> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + ethernet: usbether@1 {
> + compatible = "usb0424,ec00";
> + reg = <01>;

... here, reg should be "1" not "01". Same in all the files.

> diff --git a/arch/arm/boot/dts/bcm2835-rpi-b-rev2.dts 
> b/arch/arm/boot/dts/bcm2835-rpi-b-rev2.dts

> + {
> + usb1@1 {
> + compatible = "usb0424,9512";

I don't think that ID is correct. On my systems, I have:

RPi B (original, rev 1, 2 USB ports): 9512
RPi B (rev 2 w/ P5, 2 USB ports): 9512
RPi B+ (4 USB ports): 9514
RPi 2 (4 USB ports): 9514

Re: Bisected Regression 4.3.5 => 4.4.1 booting HP ZBook in EFI mode

2016-02-05 Thread Phil Turmel

On 02/05/2016 08:09 PM, Greg Kroah-Hartman wrote:

> Ah, you have versioned modules / builds enabled, that's what caused the
> rebuild, if you disable CONFIG_MODVERSIONS and
> CONFIG_MODULE_SRCVERSION_ALL you shouldn't rebuild everything.
> 
> If those options are disabled, then something really odd is going on
> here...

So, MODVERSIONS was off, but MODULE_SRCVERSION_ALL and
LOCALVERSION_AUTO were on.

Repeating with those turned off, nothing was rebuilt.

That was bootable, too.  So I turned them on individually
and each was still bootable.  I verified the config matched
the originals I started with.

So I cleaned my tree and started over building v4.3.5 then v4.4.1.
They both booted.  So I can no longer reproduce this.

Sorry for the noise.

Phil

[PATCH 01/14] ACPI / OSL: Cleanup initrd table override code

2016-02-05 Thread Lv Zheng

This patch cleans up the initrd table override code, merging redundant logic
and re-ordering code blocks. No functional changes.

Signed-off-by: Lv Zheng 
---
 drivers/acpi/osl.c |  114 
 1 file changed, 52 insertions(+), 62 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 67da6fb..abfc06b 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -602,6 +602,14 @@ acpi_os_predefined_override(const struct 
acpi_predefined_names *init_val,
return AE_OK;
 }
 
+static void acpi_table_taint(struct acpi_table_header *table)
+{
+   pr_warn(PREFIX
+   "Override [%4.4s-%8.8s], this is unsafe: tainting kernel\n",
+   table->signature, table->oem_table_id);
+   add_taint(TAINT_OVERRIDDEN_ACPI_TABLE, LOCKDEP_NOW_UNRELIABLE);
+}
+
 #ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
 #include 
 #include 
@@ -746,96 +754,78 @@ void __init acpi_initrd_override(void *data, size_t size)
}
}
 }
-#endif /* CONFIG_ACPI_INITRD_TABLE_OVERRIDE */
-
-static void acpi_table_taint(struct acpi_table_header *table)
-{
-   pr_warn(PREFIX
-   "Override [%4.4s-%8.8s], this is unsafe: tainting kernel\n",
-   table->signature, table->oem_table_id);
-   add_taint(TAINT_OVERRIDDEN_ACPI_TABLE, LOCKDEP_NOW_UNRELIABLE);
-}
-
-
-acpi_status
-acpi_os_table_override(struct acpi_table_header * existing_table,
-  struct acpi_table_header ** new_table)
-{
-   if (!existing_table || !new_table)
-   return AE_BAD_PARAMETER;
-
-   *new_table = NULL;
-
-#ifdef CONFIG_ACPI_CUSTOM_DSDT
-   if (strncmp(existing_table->signature, "DSDT", 4) == 0)
-   *new_table = (struct acpi_table_header *)AmlCode;
-#endif
-   if (*new_table != NULL)
-   acpi_table_taint(existing_table);
-   return AE_OK;
-}
 
 acpi_status
 acpi_os_physical_table_override(struct acpi_table_header *existing_table,
-   acpi_physical_address *address,
-   u32 *table_length)
+   acpi_physical_address *address, u32 *length)
 {
-#ifndef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
-   *table_length = 0;
-   *address = 0;
-   return AE_OK;
-#else
int table_offset = 0;
struct acpi_table_header *table;
+   u32 table_length;
 
-   *table_length = 0;
+   *length = 0;
*address = 0;
-
if (!acpi_tables_addr)
return AE_OK;
 
-   do {
-   if (table_offset + ACPI_HEADER_SIZE > all_tables_size) {
-   WARN_ON(1);
-   return AE_OK;
-   }
-
+   while (table_offset + ACPI_HEADER_SIZE <= all_tables_size) {
table = acpi_os_map_memory(acpi_tables_addr + table_offset,
   ACPI_HEADER_SIZE);
-
if (table_offset + table->length > all_tables_size) {
acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
WARN_ON(1);
return AE_OK;
}
 
-   table_offset += table->length;
+   table_length = table->length;
 
-   if (memcmp(existing_table->signature, table->signature, 4)) {
-   acpi_os_unmap_memory(table,
-ACPI_HEADER_SIZE);
-   continue;
-   }
-
-   /* Only override tables with matching oem id */
-   if (memcmp(table->oem_table_id, existing_table->oem_table_id,
+   /* Only override tables matched */
+   if (memcmp(existing_table->signature, table->signature, 4) ||
+   memcmp(table->oem_table_id, existing_table->oem_table_id,
   ACPI_OEM_TABLE_ID_SIZE)) {
-   acpi_os_unmap_memory(table,
-ACPI_HEADER_SIZE);
-   continue;
+   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   goto next_table;
}
 
-   table_offset -= table->length;
-   *table_length = table->length;
-   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   *length = table_length;
*address = acpi_tables_addr + table_offset;
+   acpi_table_taint(existing_table);
+   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
break;
-   } while (table_offset + ACPI_HEADER_SIZE < all_tables_size);
 
-   if (*address != 0)
-   acpi_table_taint(existing_table);
+next_table:
+   table_offset += table_length;
+   }
return AE_OK;
+}
+#else
+acpi_status
+acpi_os_physical_table_override(struct acpi_table_header *existing_table,
+   acpi_physical_address *address,
+

[PATCH 08/14] ACPI 2.0 / ECDT: Split EC_FLAGS_HANDLERS_INSTALLED

2016-02-05 Thread Lv Zheng

This patch splits EC_FLAGS_HANDLERS_INSTALLED so that the address space
handler can be installed when it is not possible to install the GPE handler
during the early stage.
This patch also tunes address space handler installation, making it
happen earlier than GPE handler installation for the same purpose.

Since acpi_ec_start()/acpi_ec_stop() will be entered multiple times after
applying this change, it is also required to protect acpi_enable_gpe()/
acpi_disable_gpe() invocations.

Signed-off-by: Lv Zheng 
---
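Note (not part of the commit message): a minimal userspace sketch of the
guard described above, assuming a simplified stand-in for struct acpi_ec.
The type ec_sketch, its fields and the counter gpe_enable_depth are
hypothetical and exist only for illustration; the counter stands in for the
acpi_ec_enable_gpe()/acpi_ec_disable_gpe() calls. It shows why the
reference count alone is not enough once requests can be submitted before
the GPE handler exists.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct acpi_ec; only what the sketch needs. */
struct ec_sketch {
	bool gpe_handler_installed;   /* mirrors EC_FLAGS_GPE_HANDLER_INSTALLED */
	unsigned int reference_count; /* outstanding requests */
	int gpe_enable_depth;         /* stands in for enable/disable GPE calls */
};

/* Take a reference; enable the GPE only for the first reference and only
 * when a GPE handler is actually installed. */
static void ec_submit_request(struct ec_sketch *ec)
{
	ec->reference_count++;
	if (ec->gpe_handler_installed && ec->reference_count == 1)
		ec->gpe_enable_depth++;
}

/* Drop a reference; disable the GPE only when the last reference goes
 * away and a GPE handler is installed. */
static void ec_complete_request(struct ec_sketch *ec)
{
	ec->reference_count--;
	if (ec->gpe_handler_installed && ec->reference_count == 0)
		ec->gpe_enable_depth--;
}
```

With the flag clear, submit/complete pairs only move the reference count, so
an early start/stop cycle leaves the GPE untouched.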
 drivers/acpi/ec.c |   96 ++---
 1 file changed, 55 insertions(+), 41 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index b420fb4..b8f474b 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -105,8 +105,8 @@ enum ec_command {
 enum {
EC_FLAGS_QUERY_PENDING, /* Query is pending */
EC_FLAGS_QUERY_GUARDING,/* Guard for SCI_EVT check */
-   EC_FLAGS_HANDLERS_INSTALLED,/* Handlers for GPE and
-* OpReg are installed */
+   EC_FLAGS_GPE_HANDLER_INSTALLED, /* GPE handler installed */
+   EC_FLAGS_EC_HANDLER_INSTALLED,  /* OpReg handler installed */
EC_FLAGS_STARTED,   /* Driver is started */
EC_FLAGS_STOPPED,   /* Driver is stopped */
EC_FLAGS_COMMAND_STORM, /* GPE storms occurred to the
@@ -367,7 +367,8 @@ static inline void acpi_ec_clear_gpe(struct acpi_ec *ec)
 static void acpi_ec_submit_request(struct acpi_ec *ec)
 {
ec->reference_count++;
-   if (ec->reference_count == 1)
+   if (test_bit(EC_FLAGS_GPE_HANDLER_INSTALLED, &ec->flags) &&
+   ec->reference_count == 1)
acpi_ec_enable_gpe(ec, true);
 }
 
@@ -376,7 +377,8 @@ static void acpi_ec_complete_request(struct acpi_ec *ec)
bool flushed = false;
 
ec->reference_count--;
-   if (ec->reference_count == 0)
+   if (test_bit(EC_FLAGS_GPE_HANDLER_INSTALLED, &ec->flags) &&
+   ec->reference_count == 0)
acpi_ec_disable_gpe(ec, true);
flushed = acpi_ec_flushed(ec);
if (flushed)
@@ -1287,52 +1289,64 @@ static int ec_install_handlers(struct acpi_ec *ec)
 {
acpi_status status;
 
-   if (test_bit(EC_FLAGS_HANDLERS_INSTALLED, &ec->flags))
-   return 0;
-   status = acpi_install_gpe_raw_handler(NULL, ec->gpe,
- ACPI_GPE_EDGE_TRIGGERED,
- _ec_gpe_handler, ec);
-   if (ACPI_FAILURE(status))
-   return -ENODEV;
-
acpi_ec_start(ec, false);
-   status = acpi_install_address_space_handler(ec->handle,
-   ACPI_ADR_SPACE_EC,
-   _ec_space_handler,
-   NULL, ec);
-   if (ACPI_FAILURE(status)) {
-   if (status == AE_NOT_FOUND) {
-   /*
-* Maybe OS fails in evaluating the _REG object.
-* The AE_NOT_FOUND error will be ignored and OS
-* continue to initialize EC.
-*/
-   pr_err("Fail in evaluating the _REG object"
-   " of EC device. Broken bios is suspected.\n");
-   } else {
-   acpi_ec_stop(ec, false);
-   acpi_remove_gpe_handler(NULL, ec->gpe,
-   _ec_gpe_handler);
-   return -ENODEV;
+
+   if (!test_bit(EC_FLAGS_EC_HANDLER_INSTALLED, &ec->flags)) {
+   status = acpi_install_address_space_handler(ec->handle,
+   ACPI_ADR_SPACE_EC,
+   
_ec_space_handler,
+   NULL, ec);
+   if (ACPI_FAILURE(status)) {
+   if (status == AE_NOT_FOUND) {
+   /*
+* Maybe OS fails in evaluating the _REG
+* object. The AE_NOT_FOUND error will be
+* ignored and OS * continue to initialize
+* EC.
+*/
+   pr_err("Fail in evaluating the _REG object"
+   " of EC device. Broken bios is 
suspected.\n");
+   } else {
+   acpi_ec_stop(ec, false);
+   return -ENODEV;
+   }
+   }
+   set_bit(EC_FLAGS_EC_HANDLER_INSTALLED, &ec->flags);
+   }
+
+   if (!test_bit(EC_FLAGS_GPE_HANDLER_INSTALLED, &ec->flags)) {
+   status = acpi_install_gpe_raw_handler(NULL, ec->gpe,
+

[PATCH 02/14] ACPI / OSL: Add support to install tables via initrd

2016-02-05 Thread Lv Zheng

This patch adds support to install tables from initrd.

If a table in the initrd is not used by the override mechanism, the table
is installed after all RSDT/XSDT tables have been initialized.

Reference: https://lkml.org/lkml/2014/2/28/368
Reported-by: Thomas Renninger 
Signed-off-by: Lv Zheng 
---
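Note (not part of the commit message): a self-contained sketch of the walk
that acpi_initrd_initialize_tables() performs, under simplifying
assumptions. The blob layout here is hypothetical (a 4-byte signature
followed by a 4-byte host-endian total length), and blob_walk,
install_remaining and HDR_SIZE are illustrative names, not kernel symbols.
The extra "length smaller than the header" check is an addition of this
sketch, not something the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE   8   /* hypothetical header: 4-byte signature + 4-byte length */
#define MAX_TABLES 64  /* same bound as ACPI_OVERRIDE_TABLES in the patch */

struct blob_walk {
	const uint8_t *base;  /* concatenated tables */
	size_t size;          /* total size of the blob */
	uint64_t installed;   /* bit i set: table i was already consumed */
};

/* Install every table not yet consumed by the override path, skipping
 * RSDT/XSDT. Returns the number installed in this pass, -1 if malformed. */
static int install_remaining(struct blob_walk *w)
{
	size_t off = 0;
	int idx = 0, count = 0;

	while (off + HDR_SIZE <= w->size && idx < MAX_TABLES) {
		const uint8_t *hdr = w->base + off;
		uint32_t len;
		int skip;

		memcpy(&len, hdr + 4, sizeof(len));
		if (len < HDR_SIZE || len > w->size - off)
			return -1;  /* table would overrun the blob */

		skip = ((w->installed >> idx) & 1) ||
		       !memcmp(hdr, "RSDT", 4) ||
		       !memcmp(hdr, "XSDT", 4);
		if (!skip) {
			/* the real code calls acpi_install_table() here */
			w->installed |= (uint64_t)1 << idx;
			count++;
		}

		off += len;  /* always advance, matched or not */
		idx++;
	}
	return count;
}
```

Because the bitmap is shared with the override path, a table is consumed at
most once: either as an override or as a newly installed table.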
 drivers/acpi/internal.h |1 +
 drivers/acpi/osl.c  |   50 ++-
 drivers/acpi/tables.c   |2 ++
 3 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 1e6833a..27c2cb9 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -20,6 +20,7 @@
 
 #define PREFIX "ACPI: "
 
+void acpi_initrd_initialize_tables(void);
 acpi_status acpi_os_initialize1(void);
 void init_acpi_device_notify(void);
 int acpi_scan_init(void);
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index abfc06b..26f1dc8 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -644,6 +644,7 @@ static const char * const table_sigs[] = {
 
 #define ACPI_OVERRIDE_TABLES 64
 static struct cpio_data __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
+static DECLARE_BITMAP(acpi_initrd_installed, ACPI_OVERRIDE_TABLES);
 
 #define MAP_CHUNK_SIZE   (NR_FIX_BTMAPS << PAGE_SHIFT)
 
@@ -760,6 +761,7 @@ acpi_os_physical_table_override(struct acpi_table_header 
*existing_table,
acpi_physical_address *address, u32 *length)
 {
int table_offset = 0;
+   int table_index = 0;
struct acpi_table_header *table;
u32 table_length;
 
@@ -780,7 +782,8 @@ acpi_os_physical_table_override(struct acpi_table_header 
*existing_table,
table_length = table->length;
 
/* Only override tables matched */
-   if (memcmp(existing_table->signature, table->signature, 4) ||
+   if (test_bit(table_index, acpi_initrd_installed) ||
+   memcmp(existing_table->signature, table->signature, 4) ||
memcmp(table->oem_table_id, existing_table->oem_table_id,
   ACPI_OEM_TABLE_ID_SIZE)) {
acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
@@ -791,13 +794,54 @@ acpi_os_physical_table_override(struct acpi_table_header 
*existing_table,
*address = acpi_tables_addr + table_offset;
acpi_table_taint(existing_table);
acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   set_bit(table_index, acpi_initrd_installed);
break;
 
 next_table:
table_offset += table_length;
+   table_index++;
}
return AE_OK;
 }
+
+void __init acpi_initrd_initialize_tables(void)
+{
+   int table_offset = 0;
+   int table_index = 0;
+   u32 table_length;
+   struct acpi_table_header *table;
+
+   if (!acpi_tables_addr)
+   return;
+
+   while (table_offset + ACPI_HEADER_SIZE <= all_tables_size) {
+   table = acpi_os_map_memory(acpi_tables_addr + table_offset,
+  ACPI_HEADER_SIZE);
+   if (table_offset + table->length > all_tables_size) {
+   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   WARN_ON(1);
+   return;
+   }
+
+   table_length = table->length;
+
+   /* Skip RSDT/XSDT which should only be used for override */
+   if (test_bit(table_index, acpi_initrd_installed) ||
+   ACPI_COMPARE_NAME(table->signature, ACPI_SIG_RSDT) ||
+   ACPI_COMPARE_NAME(table->signature, ACPI_SIG_XSDT)) {
+   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   goto next_table;
+   }
+
+   acpi_table_taint(table);
+   acpi_os_unmap_memory(table, ACPI_HEADER_SIZE);
+   acpi_install_table(acpi_tables_addr + table_offset, TRUE);
+   set_bit(table_index, acpi_initrd_installed);
+next_table:
+   table_offset += table_length;
+   table_index++;
+   }
+}
 #else
 acpi_status
 acpi_os_physical_table_override(struct acpi_table_header *existing_table,
@@ -808,6 +852,10 @@ acpi_os_physical_table_override(struct acpi_table_header 
*existing_table,
*address = 0;
return AE_OK;
 }
+
+void __init acpi_initrd_initialize_tables(void)
+{
+}
 #endif /* CONFIG_ACPI_INITRD_TABLE_OVERRIDE */
 
 acpi_status
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 6c0f079..57c0a45 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include "internal.h"
 
 #define ACPI_MAX_TABLES128
 
@@ -456,6 +457,7 @@ int __init acpi_table_init(void)
status = acpi_initialize_tables(initial_tables, ACPI_MAX_TABLES, 0);
if (ACPI_FAILURE(status))
return -EINVAL;

[PATCH 03/14] ACPI 2.0 / AML: Make default region accessible during the table load

2016-02-05 Thread Lv Zheng

ACPICA commit 016b2a0917cca9cf0d40c38a1541017d9cf569dd

It is proven that the default regions should be accessible during the
table loading in order to execute module level AML code.
This patch moves default region handler installation code earlier in
order to make this happen.
Note that by putting the code here, we actually allow OSPMs to override
default region handlers between acpi_initialize_subsystem() and
acpi_load_tables(), without the need to introduce region handler override
mechanism in acpi_install_address_space_handler(). OSPMs are also encouraged
to check acpi_install_address_space_handler() return value to determine if
acpi_remove_address_space_handler() should be invoked before installing new
address space handler. Lv Zheng.

Link: https://github.com/acpica/acpica/commit/016b2a09
Signed-off-by: Lv Zheng 
---
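Note (not part of the commit message): a toy model of the control flow this
patch introduces, assuming a boolean in place of the real handler lists.
The names subsystem, ST_* and the three functions are hypothetical; they
only mirror the roles of acpi_gbl_group_module_level_code,
acpi_ev_install_region_handlers(), acpi_load_tables() and
acpi_enable_subsystem() in the diff below.

```c
#include <stdbool.h>

enum status { ST_OK, ST_ALREADY_EXISTS, ST_ERROR };

struct subsystem {
	bool group_module_level_code; /* models acpi_gbl_group_module_level_code */
	bool handlers_installed;      /* default region handlers present */
};

/* Models the installer: reports when handlers are already in place. */
static enum status install_region_handlers(struct subsystem *s)
{
	if (s->handlers_installed)
		return ST_ALREADY_EXISTS;
	s->handlers_installed = true;
	return ST_OK;
}

/* Table load: install early unless the legacy grouped mode is selected.
 * "Already exists" is tolerated so the host OS may pre-install handlers. */
static enum status load_tables(struct subsystem *s)
{
	if (!s->group_module_level_code) {
		enum status st = install_region_handlers(s);

		if (st != ST_OK && st != ST_ALREADY_EXISTS)
			return st;
	}
	return ST_OK; /* namespace loading would follow here */
}

/* Subsystem enable: the legacy grouped mode keeps the old, later point. */
static enum status enable_subsystem(struct subsystem *s)
{
	if (s->group_module_level_code) {
		enum status st = install_region_handlers(s);

		if (st != ST_OK)
			return st;
	}
	return ST_OK;
}
```

In either mode the handlers are installed exactly once; only the point at
which that happens moves.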
 drivers/acpi/acpica/tbxfload.c |   22 ++
 drivers/acpi/acpica/utxfinit.c |   12 +++-
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/acpica/tbxfload.c b/drivers/acpi/acpica/tbxfload.c
index 278666e..12068ba 100644
--- a/drivers/acpi/acpica/tbxfload.c
+++ b/drivers/acpi/acpica/tbxfload.c
@@ -47,6 +47,7 @@
 #include "accommon.h"
 #include "acnamesp.h"
 #include "actables.h"
+#include "acevents.h"
 
 #define _COMPONENT  ACPI_TABLES
 ACPI_MODULE_NAME("tbxfload")
@@ -68,6 +69,27 @@ acpi_status __init acpi_load_tables(void)
 
ACPI_FUNCTION_TRACE(acpi_load_tables);
 
+   /*
+* Install the default operation region handlers. These are the
+* handlers that are defined by the ACPI specification to be
+* "always accessible" -- namely, system_memory, system_IO, and
+* PCI_Config. This also means that no _REG methods need to be
+* run for these address spaces. We need to have these handlers
+* installed before any AML code can be executed, especially any
+* module-level code (11/2015).
+* Note that we allow OSPMs to install their own region handlers
+* between acpi_initialize_subsystem() and acpi_load_tables() to use
+* their customized default region handlers.
+*/
+   if (!acpi_gbl_group_module_level_code) {
+   status = acpi_ev_install_region_handlers();
+   if (ACPI_FAILURE(status) && status != AE_ALREADY_EXISTS) {
+   ACPI_EXCEPTION((AE_INFO, status,
+   "During Region initialization"));
+   return_ACPI_STATUS(status);
+   }
+   }
+
/* Load the namespace from the tables */
 
status = acpi_tb_load_namespace();
diff --git a/drivers/acpi/acpica/utxfinit.c b/drivers/acpi/acpica/utxfinit.c
index 721b87c..22f50c2 100644
--- a/drivers/acpi/acpica/utxfinit.c
+++ b/drivers/acpi/acpica/utxfinit.c
@@ -163,11 +163,13 @@ acpi_status __init acpi_enable_subsystem(u32 flags)
 * installed before any AML code can be executed, especially any
 * module-level code (11/2015).
 */
-   status = acpi_ev_install_region_handlers();
-   if (ACPI_FAILURE(status)) {
-   ACPI_EXCEPTION((AE_INFO, status,
-   "During Region initialization"));
-   return_ACPI_STATUS(status);
+   if (acpi_gbl_group_module_level_code) {
+   status = acpi_ev_install_region_handlers();
+   if (ACPI_FAILURE(status)) {
+   ACPI_EXCEPTION((AE_INFO, status,
+   "During Region initialization"));
+   return_ACPI_STATUS(status);
+   }
}
 #if (!ACPI_REDUCED_HARDWARE)
 
-- 
1.7.10

[PATCH 04/14] ACPI 2.0 / AML: Tune _REG evaluations order in the initialization steps

2016-02-05 Thread Lv Zheng

ACPICA commit 77e0c7a482ac30ef857cf3c33d075e5fe5b5e449

This patch tunes _REG evaluations to be later than all table loading
facilities:
1. acpi_load_tables(): _REG is currently invoked after this function.
2. acpi_ns_exec_module_code_list(): this executes module level code, the
   execution should be a part of the table loading while we currently
   support this in a deferred way.
3. acpi_ns_initialize_objects(): this parses Region/Field/Buffer/Package where
   pkg_length primitive can be seen in the grammar, the parsing should be a
   part of the table loading while we currently support this in a deferred
   way.
Control method evaluation should happen after loading the tables. So this
patch changes the order of _REG evaluation when
acpi_gbl_group_module_level_code experiment is enabled. Lv Zheng.

Link: https://github.com/acpica/acpica/commit/77e0c7a4
Signed-off-by: Lv Zheng 
---
 drivers/acpi/acpica/utxfinit.c |   35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/acpi/acpica/utxfinit.c b/drivers/acpi/acpica/utxfinit.c
index 22f50c2..2caaec7 100644
--- a/drivers/acpi/acpica/utxfinit.c
+++ b/drivers/acpi/acpica/utxfinit.c
@@ -262,23 +262,6 @@ acpi_status __init acpi_initialize_objects(u32 flags)
 
ACPI_FUNCTION_TRACE(acpi_initialize_objects);
 
-   /*
-* Run all _REG methods
-*
-* Note: Any objects accessed by the _REG methods will be automatically
-* initialized, even if they contain executable AML (see the call to
-* acpi_ns_initialize_objects below).
-*/
-   acpi_gbl_reg_methods_enabled = TRUE;
-   if (!(flags & ACPI_NO_ADDRESS_SPACE_INIT)) {
-   ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
- "[Init] Executing _REG OpRegion methods\n"));
-
-   status = acpi_ev_initialize_op_regions();
-   if (ACPI_FAILURE(status)) {
-   return_ACPI_STATUS(status);
-   }
-   }
 #ifdef ACPI_EXEC_APP
/*
 * This call implements the "initialization file" option for acpi_exec.
@@ -319,6 +302,24 @@ acpi_status __init acpi_initialize_objects(u32 flags)
}
 
/*
+* Run all _REG methods
+*
+* Note: Any objects accessed by the _REG methods will be automatically
+* initialized, even if they contain executable AML (see the call to
+* acpi_ns_initialize_objects below).
+*/
+   acpi_gbl_reg_methods_enabled = TRUE;
+   if (!(flags & ACPI_NO_ADDRESS_SPACE_INIT)) {
+   ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
+ "[Init] Executing _REG OpRegion methods\n"));
+
+   status = acpi_ev_initialize_op_regions();
+   if (ACPI_FAILURE(status)) {
+   return_ACPI_STATUS(status);
+   }
+   }
+
+   /*
 * Initialize all device objects in the namespace. This runs the device
 * _STA and _INI methods.
 */
-- 
1.7.10

[PATCH 05/14] ACPI 2.0 / AML: Ensure \_SB._INI executed before any _REG

2016-02-05 Thread Lv Zheng

ACPICA commit 8ae25b8d128b6b8509010be321ff6bf2760f3807
ACPICA commit 19f84c249267fab0bfb138bd14d12510fb4faf24

There is BIOS code relying on the fact that \_SB._INI should get evaluated
before any other control methods. This may imply a gap in ACPICA/Linux
initialization/enumeration process.

Until the true Windows behavior is revealed by more validation, this patch
only ensures that \_SB._INI is evaluated before any _REG control methods.
This helps make progress toward other initialization order fixes. Lv Zheng.

Link: https://github.com/acpica/acpica/commit/8ae25b8d
Link: https://github.com/acpica/acpica/commit/19f84c24
Signed-off-by: Lv Zheng 
Signed-off-by: Bob Moore 
---
 drivers/acpi/acpica/acnamesp.h |2 +-
 drivers/acpi/acpica/nsinit.c   |  135 
 drivers/acpi/acpica/utxfinit.c |   27 ++--
 3 files changed, 86 insertions(+), 78 deletions(-)

diff --git a/drivers/acpi/acpica/acnamesp.h b/drivers/acpi/acpica/acnamesp.h
index 9684ed6..022d69c 100644
--- a/drivers/acpi/acpica/acnamesp.h
+++ b/drivers/acpi/acpica/acnamesp.h
@@ -88,7 +88,7 @@
  */
 acpi_status acpi_ns_initialize_objects(void);
 
-acpi_status acpi_ns_initialize_devices(void);
+acpi_status acpi_ns_initialize_devices(u32 flags);
 
 /*
  * nsload -  Namespace loading
diff --git a/drivers/acpi/acpica/nsinit.c b/drivers/acpi/acpica/nsinit.c
index bd75d46..f029a3d 100644
--- a/drivers/acpi/acpica/nsinit.c
+++ b/drivers/acpi/acpica/nsinit.c
@@ -46,6 +46,7 @@
 #include "acnamesp.h"
 #include "acdispat.h"
 #include "acinterp.h"
+#include "acevents.h"
 
 #define _COMPONENT  ACPI_NAMESPACE
 ACPI_MODULE_NAME("nsinit")
@@ -133,82 +134,108 @@ acpi_status acpi_ns_initialize_objects(void)
  *
  
**/
 
-acpi_status acpi_ns_initialize_devices(void)
+acpi_status acpi_ns_initialize_devices(u32 flags)
 {
-   acpi_status status;
+   acpi_status status = AE_OK;
struct acpi_device_walk_info info;
 
ACPI_FUNCTION_TRACE(ns_initialize_devices);
 
-   /* Init counters */
+   if (!(flags & ACPI_NO_DEVICE_INIT)) {
+   ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
+ "[Init] Initializing ACPI Devices\n"));
 
-   info.device_count = 0;
-   info.num_STA = 0;
-   info.num_INI = 0;
+   /* Init counters */
 
-   ACPI_DEBUG_PRINT_RAW((ACPI_DB_INIT,
- "Initializing Device/Processor/Thermal objects "
- "and executing _INI/_STA methods:\n"));
+   info.device_count = 0;
+   info.num_STA = 0;
+   info.num_INI = 0;
 
-   /* Tree analysis: find all subtrees that contain _INI methods */
+   ACPI_DEBUG_PRINT_RAW((ACPI_DB_INIT,
+ "Initializing Device/Processor/Thermal 
objects "
+ "and executing _INI/_STA methods:\n"));
 
-   status = acpi_ns_walk_namespace(ACPI_TYPE_ANY, ACPI_ROOT_OBJECT,
-   ACPI_UINT32_MAX, FALSE,
-   acpi_ns_find_ini_methods, NULL, ,
-   NULL);
-   if (ACPI_FAILURE(status)) {
-   goto error_exit;
-   }
+   /* Tree analysis: find all subtrees that contain _INI methods */
+
+   status = acpi_ns_walk_namespace(ACPI_TYPE_ANY, ACPI_ROOT_OBJECT,
+   ACPI_UINT32_MAX, FALSE,
+   acpi_ns_find_ini_methods, NULL,
+   , NULL);
+   if (ACPI_FAILURE(status)) {
+   goto error_exit;
+   }
+
+   /* Allocate the evaluation information block */
 
-   /* Allocate the evaluation information block */
+   info.evaluate_info =
+   ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_evaluate_info));
+   if (!info.evaluate_info) {
+   status = AE_NO_MEMORY;
+   goto error_exit;
+   }
 
-   info.evaluate_info =
-   ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_evaluate_info));
-   if (!info.evaluate_info) {
-   status = AE_NO_MEMORY;
-   goto error_exit;
+   /*
+* Execute the "global" _INI method that may appear at the root.
+* This support is provided for Windows compatibility (Vista+) 
and
+* is not part of the ACPI specification.
+*/
+   info.evaluate_info->prefix_node = acpi_gbl_root_node;
+   info.evaluate_info->relative_pathname = METHOD_NAME__INI;
+   info.evaluate_info->parameters = NULL;
+   info.evaluate_info->flags = ACPI_IGNORE_RETURN_VALUE;
+
+   status = acpi_ns_evaluate(info.evaluate_info);
+

[PATCH 14/14] ACPI 2.0 / AML: Fix module level execution by correctly parsing table as TermList

2016-02-05 Thread Lv Zheng

This experiment follows de-facto standard behavior, parsing entire
table as a single TermList, so that all module level executions are
possible during the table loading.

If regressions are found after this fix is enabled, this patch is the only
one that should be bisected. Please report the regressions to the kernel
bugzilla so that the root cause can be investigated.

Signed-off-by: Lv Zheng 
---
 include/acpi/acpixf.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index 19c98a3..fd07392 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -199,7 +199,7 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, 
FALSE);
  * a term_list.
  * For disassembler, this should be FALSE.
  */
-ACPI_INIT_GLOBAL(u8, acpi_gbl_parse_table_as_term_list, FALSE);
+ACPI_INIT_GLOBAL(u8, acpi_gbl_parse_table_as_term_list, TRUE);
 
 /*
  * Optionally use 32-bit FADT addresses if and when there is a conflict
-- 
1.7.10

[PATCH 07/14] ACPICA: Events: Fix an issue that _REG association can happen before namespace is initialized

2016-02-05 Thread Lv Zheng

The current code flow cannot ensure that _REG association happens after the
namespace is initialized, so we move _REG association to the point where
_REG is about to run to fix this issue.

This issue is detected when acpi_ev_initialize_region() is invoked during
the table loading. It is also one of the most important root causes why
ACPICA table loading is split into 2 load passes. Lv Zheng.

Signed-off-by: Lv Zheng 
---
 drivers/acpi/acpica/acevents.h |2 --
 drivers/acpi/acpica/evregion.c |   71 ++--
 drivers/acpi/acpica/evrgnini.c |1 -
 3 files changed, 24 insertions(+), 50 deletions(-)

diff --git a/drivers/acpi/acpica/acevents.h b/drivers/acpi/acpica/acevents.h
index 010cf81..17f2217 100644
--- a/drivers/acpi/acpica/acevents.h
+++ b/drivers/acpi/acpica/acevents.h
@@ -198,8 +198,6 @@ void
 acpi_ev_detach_region(union acpi_operand_object *region_obj,
  u8 acpi_ns_is_locked);
 
-void acpi_ev_associate_reg_method(union acpi_operand_object *region_obj);
-
 void
 acpi_ev_execute_reg_methods(struct acpi_namespace_node *node,
acpi_adr_space_type space_id, u32 function);
diff --git a/drivers/acpi/acpica/evregion.c b/drivers/acpi/acpica/evregion.c
index 63924d1..72f553c 100644
--- a/drivers/acpi/acpica/evregion.c
+++ b/drivers/acpi/acpica/evregion.c
@@ -526,81 +526,58 @@ acpi_ev_attach_region(union acpi_operand_object 
*handler_obj,
 
 
/***
  *
- * FUNCTION:acpi_ev_associate_reg_method
+ * FUNCTION:acpi_ev_execute_reg_method
  *
  * PARAMETERS:  region_obj  - Region object
+ *  function- Passed to _REG: On (1) or Off (0)
  *
  * RETURN:  Status
  *
- * DESCRIPTION: Find and associate _REG method to a region
+ * DESCRIPTION: Execute _REG method for a region
  *
  
**/
 
-void acpi_ev_associate_reg_method(union acpi_operand_object *region_obj)
+acpi_status
+acpi_ev_execute_reg_method(union acpi_operand_object *region_obj, u32 function)
 {
+   struct acpi_evaluate_info *info;
+   union acpi_operand_object *args[3];
+   union acpi_operand_object *region_obj2;
acpi_name *reg_name_ptr = (acpi_name *) METHOD_NAME__REG;
struct acpi_namespace_node *method_node;
struct acpi_namespace_node *node;
-   union acpi_operand_object *region_obj2;
acpi_status status;
 
-   ACPI_FUNCTION_TRACE(ev_associate_reg_method);
+   ACPI_FUNCTION_TRACE(ev_execute_reg_method);
+
+   if (!acpi_gbl_namespace_initialized ||
+   region_obj->region.handler == NULL) {
+   return_ACPI_STATUS(AE_OK);
+   }
 
region_obj2 = acpi_ns_get_secondary_object(region_obj);
if (!region_obj2) {
-   return_VOID;
+   return_ACPI_STATUS(AE_NOT_EXIST);
}
 
+   /*
+* Find any "_REG" method associated with this region definition.
+* The method should always be updated as this function may be
+* invoked after a namespace change.
+*/
node = region_obj->region.node->parent;
-
-   /* Find any "_REG" method associated with this region definition */
-
status =
acpi_ns_search_one_scope(*reg_name_ptr, node, ACPI_TYPE_METHOD,
 &method_node);
if (ACPI_SUCCESS(status)) {
/*
-* The _REG method is optional and there can be only one per region
-* definition. This will be executed when the handler is attached
-* or removed
+* The _REG method is optional and there can be only one per
+* region definition. This will be executed when the handler is
+* attached or removed.
 */
region_obj2->extra.method_REG = method_node;
}
-
-   return_VOID;
-}
-
-/***
- *
- * FUNCTION:acpi_ev_execute_reg_method
- *
- * PARAMETERS:  region_obj  - Region object
- *  function- Passed to _REG: On (1) or Off (0)
- *
- * RETURN:  Status
- *
- * DESCRIPTION: Execute _REG method for a region
- *
- 
**/
-
-acpi_status
-acpi_ev_execute_reg_method(union acpi_operand_object *region_obj, u32 function)
-{
-   struct acpi_evaluate_info *info;
-   union acpi_operand_object *args[3];
-   union acpi_operand_object *region_obj2;
-   acpi_status status;
-
-   ACPI_FUNCTION_TRACE(ev_execute_reg_method);
-
-   region_obj2 = acpi_ns_get_secondary_object(region_obj);
-   if (!region_obj2) {
-   return_ACPI_STATUS(AE_NOT_EXIST);
-   }
-
-   if (region_obj2->extra.method_REG == NULL ||
-   region_obj->region.handler ==

[PATCH 13/14] ACPI 2.0 / AML: Enable correct ACPI subsystem initialization order for new table loading mode

2016-02-05 Thread Lv Zheng

This patch enables the following initialization order for the new table
loading mode (which is enabled by setting
acpi_gbl_parse_table_as_term_list to TRUE):
  1. Install default region handlers (SystemMemory, SystemIo, PciConfig,
 EmbeddedControl via ECDT) without evaluating _REG;
  2. Load the table and execute the module level AML opcodes instantly.
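
As a rough illustration of the ordering described above (a sketch, not the
kernel code): the two conditions changed in drivers/acpi/bus.c are
complementary, so acpi_load_tables() runs in exactly one of the two init
phases. The enum and the function name below are hypothetical; only the two
flag names are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical labels for the two places where bus.c may load the tables. */
enum load_phase {
	LOAD_IN_EARLY_INIT,	/* acpi_early_init(), before ECDT probing */
	LOAD_IN_BUS_INIT	/* acpi_bus_init(), after ECDT probing */
};

/*
 * Mirrors the two "if" conditions in the diff: tables are loaded early only
 * when the new TermList mode is off and module level code is grouped.
 * In every other case the load is deferred until after acpi_ec_ecdt_probe().
 */
enum load_phase table_load_phase(bool parse_table_as_term_list,
				 bool group_module_level_code)
{
	if (!parse_table_as_term_list && group_module_level_code)
		return LOAD_IN_EARLY_INIT;
	return LOAD_IN_BUS_INIT;
}
```

With acpi_gbl_parse_table_as_term_list set, the load always lands after the
ECDT probe, which is what allows module level code to access the EC.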

Signed-off-by: Lv Zheng 
---
 drivers/acpi/bus.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 8752860..0d6bb22 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -911,7 +911,8 @@ void __init acpi_early_init(void)
goto error0;
}
 
-   if (acpi_gbl_group_module_level_code) {
+   if (!acpi_gbl_parse_table_as_term_list &&
+   acpi_gbl_group_module_level_code) {
status = acpi_load_tables();
if (ACPI_FAILURE(status)) {
printk(KERN_ERR PREFIX
@@ -994,7 +995,8 @@ static int __init acpi_bus_init(void)
status = acpi_ec_ecdt_probe();
/* Ignore result. Not having an ECDT is not fatal. */
 
-   if (!acpi_gbl_group_module_level_code) {
+   if (acpi_gbl_parse_table_as_term_list ||
+   !acpi_gbl_group_module_level_code) {
status = acpi_load_tables();
if (ACPI_FAILURE(status)) {
printk(KERN_ERR PREFIX
-- 
1.7.10

[PATCH 06/14] ACPI 2.0 / AML: Rename acpi_gbl_reg_methods_enabled to acpi_gbl_namespace_initialized

2016-02-05 Thread Lv Zheng

ACPICA commit 4be3b82cf45d324366ea8567102d5108c5ef47cb

The global variable actually indicates the availability of the namespace, and
control method evaluations should happen only after the namespace is ready.
Thus this patch renames the global variable to reflect this logic. Lv Zheng.

Link: https://github.com/acpica/acpica/commit/4be3b82c
Signed-off-by: Lv Zheng 
---
 drivers/acpi/acpica/acglobal.h |2 +-
 drivers/acpi/acpica/evregion.c |2 +-
 drivers/acpi/acpica/utxfinit.c |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpica/acglobal.h b/drivers/acpi/acpica/acglobal.h
index 55c8197..51b073b 100644
--- a/drivers/acpi/acpica/acglobal.h
+++ b/drivers/acpi/acpica/acglobal.h
@@ -165,7 +165,7 @@ ACPI_GLOBAL(u8, acpi_gbl_next_owner_id_offset);
 
 /* Initialization sequencing */
 
-ACPI_INIT_GLOBAL(u8, acpi_gbl_reg_methods_enabled, FALSE);
+ACPI_INIT_GLOBAL(u8, acpi_gbl_namespace_initialized, FALSE);
 
 /* Misc */
 
diff --git a/drivers/acpi/acpica/evregion.c b/drivers/acpi/acpica/evregion.c
index 47092b4..63924d1 100644
--- a/drivers/acpi/acpica/evregion.c
+++ b/drivers/acpi/acpica/evregion.c
@@ -600,7 +600,7 @@ acpi_ev_execute_reg_method(union acpi_operand_object *region_obj, u32 function)
 
if (region_obj2->extra.method_REG == NULL ||
region_obj->region.handler == NULL ||
-   !acpi_gbl_reg_methods_enabled) {
+   !acpi_gbl_namespace_initialized) {
return_ACPI_STATUS(AE_OK);
}
 
diff --git a/drivers/acpi/acpica/utxfinit.c b/drivers/acpi/acpica/utxfinit.c
index 6a0c2ee..139f65f 100644
--- a/drivers/acpi/acpica/utxfinit.c
+++ b/drivers/acpi/acpica/utxfinit.c
@@ -301,7 +301,7 @@ acpi_status __init acpi_initialize_objects(u32 flags)
}
}
 
-   acpi_gbl_reg_methods_enabled = TRUE;
+   acpi_gbl_namespace_initialized = TRUE;
 
/*
 * Initialize all device/region objects in the namespace. This runs
-- 
1.7.10

[PATCH 10/14] ACPI 2.0 / ECDT: Enable correct ECDT initialization order

2016-02-05 Thread Lv Zheng

With the wrong ECDT fixes reverted, it becomes possible to put ECDT probing
before acpi_enable_subsystem().

But the ultimate purpose of re-enabling ECDT is to put the ECDT probing
before the namespace initialization (acpi_load_tables()). This patch
achieves this behind a guard so that it can be enabled later, once all
necessary corrections are upstreamed.

Signed-off-by: Lv Zheng  
Signed-off-by: Lv Zheng 
---
 drivers/acpi/bus.c |   39 +--
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 891c42d..8752860 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -911,11 +911,13 @@ void __init acpi_early_init(void)
goto error0;
}
 
-   status = acpi_load_tables();
-   if (ACPI_FAILURE(status)) {
-   printk(KERN_ERR PREFIX
-  "Unable to load the System Description Tables\n");
-   goto error0;
+   if (acpi_gbl_group_module_level_code) {
+   status = acpi_load_tables();
+   if (ACPI_FAILURE(status)) {
+   printk(KERN_ERR PREFIX
+  "Unable to load the System Description Tables\n");
+   goto error0;
+   }
}
 
 #ifdef CONFIG_X86
@@ -981,17 +983,10 @@ static int __init acpi_bus_init(void)
 
acpi_os_initialize1();
 
-   status = acpi_enable_subsystem(ACPI_NO_ACPI_ENABLE);
-   if (ACPI_FAILURE(status)) {
-   printk(KERN_ERR PREFIX
-  "Unable to start the ACPI Interpreter\n");
-   goto error1;
-   }
-
/*
 * ACPI 2.0 requires the EC driver to be loaded and work before
-* the EC device is found in the namespace (i.e. before acpi_initialize_objects()
-* is called).
+* the EC device is found in the namespace (i.e. before
+* acpi_load_tables() is called).
 *
 * This is accomplished by looking for the ECDT table, and getting
 * the EC parameters out of that.
@@ -999,6 +994,22 @@ static int __init acpi_bus_init(void)
status = acpi_ec_ecdt_probe();
/* Ignore result. Not having an ECDT is not fatal. */
 
+   if (!acpi_gbl_group_module_level_code) {
+   status = acpi_load_tables();
+   if (ACPI_FAILURE(status)) {
+   printk(KERN_ERR PREFIX
+  "Unable to load the System Description Tables\n");
+   goto error1;
+   }
+   }
+
+   status = acpi_enable_subsystem(ACPI_NO_ACPI_ENABLE);
+   if (ACPI_FAILURE(status)) {
+   printk(KERN_ERR PREFIX
+  "Unable to start the ACPI Interpreter\n");
+   goto error1;
+   }
+
status = acpi_initialize_objects(ACPI_FULL_INITIALIZATION);
if (ACPI_FAILURE(status)) {
printk(KERN_ERR PREFIX "Unable to initialize ACPI objects\n");
-- 
1.7.10

[PATCH 12/14] ACPI 2.0 / AML: Add TermList parsing support for table loading

2016-02-05 Thread Lv Zheng

It has been shown that not only If/Else/While opcodes but all opcodes can be
executed at the module level, including operation region accesses. BIOS
developers are responsible for guarding operation region accesses when the
operation region drivers have not been loaded by the OSPM (driver
availability is indicated via _REG(CONNECT)).
So in fact, all early operation region accesses allowed by the spec should
be possible at the module level. This includes the default operation regions
(SystemMemory, SystemIo, PciConfig) and early operation regions
(EmbeddedControl reported via ECDT).

These facts reflect the spec wording on ACPI definition block tables
(DSDT/SSDT/...). A table is defined by the AML specification in BNF style
as AMLCode:
  AMLCode := DefBlockHeader TermList
where DefBlockHeader is the header of the DSDT/SSDT table. So table loading
should be no different from control method evaluation, as the body of a
control method is also defined by the AML specification as a TermList:
  DefMethod := MethodOp PkgLength NameString MethodFlags TermList
The only difference is that named objects created while evaluating a control
method can be freed afterwards because nothing references them, while named
objects created by loading a table should only be freed after the table is
unloaded.

So this patch follows the spec and the de-facto standard behavior and enables
TermList evaluation when the table is loaded. Lv Zheng.

Signed-off-by: Lv Zheng 
---
 drivers/acpi/acpica/acnamesp.h |3 +
 drivers/acpi/acpica/acparser.h |2 +
 drivers/acpi/acpica/exconfig.c |7 +-
 drivers/acpi/acpica/nsload.c   |3 +-
 drivers/acpi/acpica/nsparse.c  |  163 
 drivers/acpi/acpica/psparse.c  |4 +-
 drivers/acpi/acpica/psxface.c  |   73 ++
 drivers/acpi/acpica/tbxfload.c |3 +-
 drivers/acpi/acpica/utxfinit.c |6 +-
 include/acpi/acpixf.h  |7 ++
 10 files changed, 233 insertions(+), 38 deletions(-)

diff --git a/drivers/acpi/acpica/acnamesp.h b/drivers/acpi/acpica/acnamesp.h
index 022d69c..28398b0 100644
--- a/drivers/acpi/acpica/acnamesp.h
+++ b/drivers/acpi/acpica/acnamesp.h
@@ -130,6 +130,9 @@ acpi_status
 acpi_ns_parse_table(u32 table_index, struct acpi_namespace_node *start_node);
 
 acpi_status
+acpi_ns_execute_table(u32 table_index, struct acpi_namespace_node *start_node);
+
+acpi_status
 acpi_ns_one_complete_parse(u32 pass_number,
   u32 table_index,
   struct acpi_namespace_node *start_node);
diff --git a/drivers/acpi/acpica/acparser.h b/drivers/acpi/acpica/acparser.h
index 7da639d..ec396e4 100644
--- a/drivers/acpi/acpica/acparser.h
+++ b/drivers/acpi/acpica/acparser.h
@@ -78,6 +78,8 @@ extern const u8 acpi_gbl_long_op_index[];
  */
 acpi_status acpi_ps_execute_method(struct acpi_evaluate_info *info);
 
+acpi_status acpi_ps_execute_table(struct acpi_evaluate_info *info);
+
 /*
  * psargs - Parse AML opcode arguments
  */
diff --git a/drivers/acpi/acpica/exconfig.c b/drivers/acpi/acpica/exconfig.c
index 011df21..1af9e4c 100644
--- a/drivers/acpi/acpica/exconfig.c
+++ b/drivers/acpi/acpica/exconfig.c
@@ -108,8 +108,10 @@ acpi_ex_add_table(u32 table_index,
 
/* Add the table to the namespace */
 
+   acpi_ex_exit_interpreter();
status = acpi_ns_load_table(table_index, parent_node);
if (ACPI_FAILURE(status)) {
+   acpi_ex_enter_interpreter();
acpi_ut_remove_reference(obj_desc);
*ddb_handle = NULL;
return_ACPI_STATUS(status);
@@ -117,8 +119,9 @@ acpi_ex_add_table(u32 table_index,
 
/* Execute any module-level code that was found in the table */
 
-   acpi_ex_exit_interpreter();
-   acpi_ns_exec_module_code_list();
+   if (acpi_gbl_parse_table_as_term_list) {
+   acpi_ns_exec_module_code_list();
+   }
acpi_ex_enter_interpreter();
 
/*
diff --git a/drivers/acpi/acpica/nsload.c b/drivers/acpi/acpica/nsload.c
index 75cdb87..f2b19bb 100644
--- a/drivers/acpi/acpica/nsload.c
+++ b/drivers/acpi/acpica/nsload.c
@@ -162,7 +162,8 @@ unlock:
 * other ACPI implementations. Optionally, the execution can be deferred
 * until later, see acpi_initialize_objects.
 */
-   if (!acpi_gbl_group_module_level_code) {
+   if (!acpi_gbl_parse_table_as_term_list
+   && !acpi_gbl_group_module_level_code) {
acpi_ns_exec_module_code_list();
}
 
diff --git a/drivers/acpi/acpica/nsparse.c b/drivers/acpi/acpica/nsparse.c
index f631a47..557e590 100644
--- a/drivers/acpi/acpica/nsparse.c
+++ b/drivers/acpi/acpica/nsparse.c
@@ -47,12 +47,103 @@
 #include "acparser.h"
 #include "acdispat.h"
 #include "actables.h"
+#include "acinterp.h"
 
 #define _COMPONENT  ACPI_NAMESPACE
 ACPI_MODULE_NAME("nsparse")
 
 
/***
  *
+ * FUNCTION:ns_execute_table
+ *
+ * PARAMETERS:

[PATCH 11/14] ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis

2016-02-05 Thread Lv Zheng

This experiment moves module level If/Else/While executions to a per-table
basis.

If regressions are found after enabling this improvement, this patch is the
only one that should get bisected. Please report the regressions to the
kernel bugzilla for further root causing.

Signed-off-by: Lv Zheng 
---
 include/acpi/acpixf.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index c96621e..a98455c 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -192,7 +192,7 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_do_not_use_xsdt, FALSE);
 /*
  * Optionally support group module level code.
  */
-ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, TRUE);
+ACPI_INIT_GLOBAL(u8, acpi_gbl_group_module_level_code, FALSE);
 
 /*
  * Optionally use 32-bit FADT addresses if and when there is a conflict
-- 
1.7.10

[PATCH 09/14] ACPI 2.0 / ECDT: Remove early namespace reference from EC

2016-02-05 Thread Lv Zheng

All operation region accesses are allowed by the AML interpreter when AML is
executed, so BIOSes are responsible for avoiding operation region accesses in
AML before OSPM has prepared an operation region driver. This is done via the
_REG control method: AML code normally sets a global named object REGC to 1
when _REG(3, 1) is evaluated.

Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
 "The presence of this table allows OSPM to provide Embedded Controller
  operation region space access before the namespace has been evaluated."
The spec also suggests a compatible means of indicating early EC access
availability:
 Device (EC)
 {
 Name (REGC, Ones)
 Method (_REG, 2)
 {
 If (LEqual (Arg0, 3))
 {
 Store (Arg1, REGC)
 }
 }
 Method (ECAV)
 {
 If (LEqual (REGC, Ones))
 {
 If (LGreaterEqual (_REV, 2))
 {
 Return (One)
 }
 Else
 {
 Return (Zero)
 }
 }
 Else
 {
 Return (REGC)
 }
 }
 }
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
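
The decision made by the ECAV method above can be modelled in C as follows.
This is only a sketch of the ASL logic for illustration: REGC_UNSET and
ecav() are hypothetical names, and the ASL constant Ones is modelled as an
all-ones 64-bit value.

```c
#include <stdint.h>

/* Models the ASL "Ones" constant that REGC is initialized with. */
#define REGC_UNSET UINT64_MAX

/*
 * Returns non-zero when AML may access the EC.
 * regc: current value of REGC (REGC_UNSET until _REG(3, x) has run)
 * rev:  value of _REV reported by the OSPM
 */
uint64_t ecav(uint64_t regc, unsigned int rev)
{
	if (regc == REGC_UNSET) {
		/* _REG has not run yet: trust ECDT-based access on ACPI 2.0+ */
		return rev >= 2 ? 1 : 0;
	}
	/* _REG has run: report what the OSPM told us */
	return regc;
}
```

Before _REG runs, the answer depends only on _REV, which is how an ACPI 2.0
OSPM gets EC access early.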

But ECAV is not the only way BIOSes indicate early EC access availability
in practice. Known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
   first control method evaluated by OSPM during the enumeration, this
   allows EC accesses to happen for the entire enumeration process before
   the namespace EC is enumerated.
2. Initializing REGC to One by default. This even allows EC accesses to
   happen during the table loading.

Linux ECDT support is now broken as a result of long-term bug fixing work
that merged many wrong ECDT bug fixes (see details below).
Linux currently uses the namespace EC's settings instead of the ECDT settings
when an ECDT is detected. This results in a namespace walk and
_CRS/_GPE/_REG evaluations. These can only happen after the namespace is
ready, while the ECDT is meant to be used before the namespace is ready.

The story of the wrong bug fixes is:
1. Link 1:
   In the early stages of Linux ACPI, neither the ACPICA core nor Linux
   ensured that "no _Lxx/_Exx/_Qxx evaluation can happen before the
   namespace is ready". This is currently ensured by deferred enabling of
   GPEs and deferred registration of EC query methods
   (acpi_ec_register_query_methods).
2. Link 2:
   Reporters reported buggy ECDTs, expecting quirks for the platforms.
   Originally, the quirk was simple and only touched the ECDT.
   Bugs 9399 and 12461 cover platforms (Asus L4R, Asus M6R, MSI MS-171F)
   reported to have wrong ECDT IO port addresses: the port addresses are
   reversed.
   Bug 11880 covers a platform (Asus X50GL) reported to have 0-valued port
   addresses. All EC accesses are protected by ECAV on this platform, so
   no early EC access is actually required by this platform.
3. Link 3:
   But when the bug-fixing developer was asked to provide a handy,
   non-quirk bug fix, he tried to use the correct EC settings from the
   namespace and broke the purpose of the spec. The developer then
   suffered from many regressions. An interesting one is 14086, where the
   actual root cause obviously is that _REG is evaluated too early.
   Unfortunately, the bug was fixed in a totally wrong way.

So everything goes wrong from these commits:
   Commit: c6cb0e878446c79f42e7833d7bb69ed6bfbb381f
   Subject: ACPI: EC: Don't trust ECDT tables from ASUS
   Commit: a5032bfdd9c80e0231a6324661e123818eb46ecd
   Subject: ACPI: EC: Always parse EC device

This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
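
For the reversed-address platforms, a "simple ECDT quirk" amounts to swapping
the two port addresses taken from the ECDT. A minimal sketch, assuming a
hypothetical struct ecdt_ports and helper name (the actual ec.c code is not
shown in this excerpt and may differ):

```c
#include <stdint.h>

/* Hypothetical container for the two EC port addresses read from the ECDT. */
struct ecdt_ports {
	uint64_t command_addr;	/* EC command/status port */
	uint64_t data_addr;	/* EC data port */
};

/* Quirk for firmware whose ECDT lists the two ports in reversed order. */
void ecdt_swap_ports(struct ecdt_ports *ports)
{
	uint64_t tmp = ports->command_addr;

	ports->command_addr = ports->data_addr;
	ports->data_addr = tmp;
}
```

The quirk touches only values that came from the ECDT, so it needs no
namespace walk and no _CRS/_GPE/_REG evaluation.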

Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
http://bugzilla.kernel.org/show_bug.cgi?id=10100
https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
https://bugzilla.kernel.org/show_bug.cgi?id=12461
https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
https://bugzilla.kernel.org/show_bug.cgi?id=14081
https://bugzilla.kernel.org/show_bug.cgi?id=14086
https://bugzilla.kernel.org/show_bug.cgi?id=14446
Signed-off-by: Lv Zheng 
---
 drivers/acpi/ec.c |  145 -
 1 file changed, 54 insertions(+), 91 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index b8f474b..0e70181 100644
--- a/drivers/acpi/ec.c
+++

[PATCH 00/14] ACPI 2.0: Enable TermList interpretation and early operation region accesses for table loading

2016-02-05 Thread Lv Zheng

The following AML code is assembled into a statically loaded SSDT and used
as instrumentation to probe the de-facto standard AML interpreter
behaviors:
  Name (ECOK, Zero)
  Scope (\)
  {
  DBUG ("TermList 1")
  If (LEqual (ECOK, Zero))
  {
  DBUG ("TermList 2")
  Device (MDEV)
  {
  DBUG ("TermList 3")
  If (CondRefOf (MDEV))
  {
  DBUG ("MDEV exists")
  }
  If (CondRefOf (MDEV._STA))
  {
  DBUG ("MDEV._STA exists")
  }
  If (CondRefOf (\_SB.PCI0.EC))
  {
  DBUG ("\\_SB.PCI0.EC exists")
  }
  Name (_HID, EisaId ("PNP"))
  Method (_STA, 0, Serialized)
  {
  DBUG ("\\_SB.MDEV._STA")
  Return (0x0F)
  }
  }
  DBUG ("TermList 4")
  }
  Method (_INI, 0, Serialized)
  {
  DBUG ("\\_SB._INI")
  }
  }
  Scope (_SB.PCI0)
  {
  Device (EC)
  {
  ...
  }
  }
DBUG is a function that writes debugging messages to a SystemIo debug port.
Running Windows with a BIOS that provides this SSDT via the RSDT, the
following messages are obtained from the debug port:
  TermList 1
  TermList 2
  TermList 3
  \_SB.MDEV exists
  TermList 4
  \_SB._INI
  ...

This test reveals the de-facto table loading behaviors:
1. During table loading, AML opcodes outside of any control method (called
   module level code in ACPICA terms) are executed by the interpreter;
2. Not only the module level AML opcodes wrapped by If/Else/While (the
   current ACPICA interpreter limitation), but all module level AML
   opcodes, including operation region accesses and method invocations,
   are executed by the interpreter;
3. Not only the module level AML opcodes under the root scope (another
   current ACPICA interpreter limitation), but the module level AML
   opcodes under any scope are executed by the interpreter;
4. The module level AML opcodes are executed not only after table loading
   (another current ACPICA interpreter limitation) but while the table is
   being loaded; please refer to the CondRefOf validations above;
5. For SystemIo, the operation region (the debugging port) is accessible
   not only after _REG(1, 1) is evaluated (another current ACPICA
   interpreter limitation) but while the table is being loaded.

In fact, the above behaviors have already been clarified in the ACPI 2.0
specification, so the compliance issue is not that Linux is not compliant
with the de-facto standard OS, but that Linux is not compliant with ACPI 2.0.
1. Definition tables are in fact defined by the spec as a TermList, which is
   no different from a control method:
 AMLCode := DefBlockHeader TermList
 DefMethod := MethodOp PkgLength NameString MethodFlags TermList
   Thus the interpretation of a table should be no different from
   control method evaluation;
2. The spec allows the default operation regions to be accessed before the
   namespace is ready (which means exactly during table loading). Such
   operation regions include SystemIo, SystemMemory, PciConfig and
   EmbeddedControl provided by ECDT; note that ECDT is also an ACPI 2.0
   feature.

Why does the ACPICA interpreter act so differently from this definition?
Because there is a lot of accumulated software entropy preventing this from
being enabled. It needs to be cleaned up first in order not to trigger
regressions on specific platforms. The issues include:
1. ECDT support is broken. In fact, the original EC driver was correct, but
   developers started to use the namespace EC instead of the ECDT just
   because several broken ECDT tables were reported on the bugzilla. They
   trusted the namespace EC settings rather than the ECDT ones. This leads
   to the evaluation of _REG/_GPE/_CRS and a namespace walk before executing
   the module level AML opcodes, and in fact disables early EC usage
   (during table loading and early device enumeration).
2. _REG evaluations are wrong. ACPICA provides APIs for OSPMs to register
   operation region handlers. For early operation region accesses, the
   ACPI spec declares that evaluating _REG is not required, but the ACPICA
   API does not avoid running _REG to meet this early requirement. Code to
   fix this was partially upstreamed during the previous ACPICA release
   cycle.
3. _REG associations are wrong. ACPICA associates the _REG control method
   with all operation region objects before executing the _REG control
   method. This can happen even when a control method is evaluated and the
   operation regions defined in the method are initialized
   (acpi_ev_initialize_region). As a part of the ACPICA internal _REG
   evaluation state machine, it requires the namespace walk, and all

[PATCH 3/3 v4] cpufreq: governor: Replace timers with utilization update callbacks

2016-02-05 Thread Rafael J. Wysocki

From: Rafael J. Wysocki 

Instead of using a per-CPU deferrable timer for queuing up governor
work items, register a utilization update callback that will be
invoked from the scheduler on utilization changes.

The sampling rate is still the same as what was used for the
deferrable timers and the added irq_work overhead should be offset by
the eliminated timers overhead, so in theory the functional impact of
this patch should not be significant.
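
A minimal sketch of the rate limiting that this scheme implies: the scheduler
may invoke the callback very often, so the handler only triggers governor
work once sample_delay_ns has elapsed since the last sample. The field names
follow the diff below; struct sample_state, set_sample_delay() and
sample_due() are hypothetical names, and where the real patch updates
last_sample_time is an assumption here.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/* Hypothetical stand-in for the relevant cpu_common_dbs_info fields. */
struct sample_state {
	uint64_t last_sample_time;	/* ns timestamp of the last sample */
	int64_t sample_delay_ns;	/* minimum distance between samples */
};

/* Same arithmetic as gov_update_sample_delay() in the diff. */
void set_sample_delay(struct sample_state *s, unsigned int delay_us)
{
	s->sample_delay_ns = (int64_t)delay_us * NSEC_PER_USEC;
}

/*
 * Called on every utilization update with the current time in ns.
 * Returns true when enough time has passed and work should be queued.
 */
bool sample_due(struct sample_state *s, uint64_t now_ns)
{
	int64_t delta_ns = (int64_t)(now_ns - s->last_sample_time);

	if (delta_ns < s->sample_delay_ns)
		return false;

	s->last_sample_time = now_ns;
	return true;
}
```

Because the check is a single subtraction and comparison, it is cheap enough
to run on every scheduler utilization update.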

Signed-off-by: Rafael J. Wysocki 
---

Updated after the recent discussion with Viresh.

Changes from v3:
- The completion used for irq_work synchronization replaced with irq_work_sync()
  in gov_cancel_work().
- update_sampling_rate() now modifies shared->sample_delay_ns for all CPUs
  where it matters directly with a big fat comment explaining why this is
  actually OK.
- The above means the time_stamp field in struct cpu_common_dbs_info is not
  necessary any more, so it is dropped.
- A build error for !CONFIG_SMP is addressed (hopefully effectively).

This version was lightly tested on an x86 laptop.

Thanks!

---
 drivers/cpufreq/cpufreq_conservative.c |6 -
 drivers/cpufreq/cpufreq_governor.c |  164 +++--
 drivers/cpufreq/cpufreq_governor.h |   19 ++-
 drivers/cpufreq/cpufreq_ondemand.c |   43 
 4 files changed, 112 insertions(+), 120 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -18,6 +18,7 @@
 #define _CPUFREQ_GOVERNOR_H
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -138,11 +139,19 @@ struct cpu_common_dbs_info {
 */
struct mutex timer_mutex;
 
-   ktime_t time_stamp;
+   u64 last_sample_time;
+   s64 sample_delay_ns;
atomic_t skip_work;
+   struct irq_work irq_work;
struct work_struct work;
 };
 
+static inline void gov_update_sample_delay(struct cpu_common_dbs_info *shared,
+  unsigned int delay_us)
+{
+   shared->sample_delay_ns = delay_us * NSEC_PER_USEC;
+}
+
 /* Per cpu structures */
 struct cpu_dbs_info {
u64 prev_cpu_idle;
@@ -155,7 +164,7 @@ struct cpu_dbs_info {
 * wake-up from idle.
 */
unsigned int prev_load;
-   struct timer_list timer;
+   struct update_util_data update_util;
struct cpu_common_dbs_info *shared;
 };
 
@@ -212,8 +221,7 @@ struct common_dbs_data {
 
struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
void *(*get_cpu_dbs_info_s)(int cpu);
-   unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy,
- bool modify_all);
+   unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
void (*gov_check_cpu)(int cpu, unsigned int load);
int (*init)(struct dbs_data *dbs_data, bool notify);
void (*exit)(struct dbs_data *dbs_data, bool notify);
@@ -270,9 +278,6 @@ static ssize_t show_sampling_rate_min_go
 }
 
 extern struct mutex cpufreq_governor_lock;
-
-void gov_add_timers(struct cpufreq_policy *policy, unsigned int delay);
-void gov_cancel_work(struct cpu_common_dbs_info *shared);
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
struct common_dbs_data *cdata, unsigned int event);
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -128,10 +128,10 @@ void dbs_check_cpu(struct dbs_data *dbs_
 * dropped down. So we perform the copy only once, upon the
 * first wake-up from idle.)
 *
-* Detecting this situation is easy: the governor's deferrable
-* timer would not have fired during CPU-idle periods. Hence
-* an unusually large 'wall_time' (as compared to the sampling
-* rate) indicates this scenario.
+* Detecting this situation is easy: the governor's utilization
+* update handler would not have run during CPU-idle periods.
+* Hence, an unusually large 'wall_time' (as compared to the
+* sampling rate) indicates this scenario.
 *
 * prev_load can be zero in two cases and we must recalculate it
 * for both cases:
@@ -161,72 +161,48 @@ void dbs_check_cpu(struct dbs_data *dbs_
 }
 EXPORT_SYMBOL_GPL(dbs_check_cpu);
 
-void gov_add_timers(struct cpufreq_policy *policy, unsigned int delay)
+void gov_set_update_util(struct cpu_common_dbs_info *shared,
+unsigned int delay_us)
 {
+   struct cpufreq_policy *policy = shared->policy;
struct dbs_data *dbs_data =

[PATCH v3 0/2] Fix ordering of ftrace/livepatch calls on module load and unload

2016-02-05 Thread Jessica Yu

As explained here [1], livepatch modules are failing to initialize properly
because the ftrace coming module notifier (which calls
ftrace_module_enable()) runs *after* the livepatch module notifier (which
enables the patch(es)). Thus livepatch attempts to apply patches to
modules before ftrace_module_enable() is even called for the corresponding
module(s). As a result, patch modules break. Ftrace code must run before
livepatch on module load, and the reverse is true on module unload.

For ftrace and livepatch, order of initialization (plus exit/cleanup code) is
important for loading and unloading modules, and using module notifiers to
perform this work is not ideal since it is not always clear what gets called
when. In this patchset, dependence on the module notifier call chain is removed
in favor of hard coding the corresponding function calls in the module loader.
This promotes better code visibility and ensures that ftrace and livepatch code
get called in the correct order on patch module load and unload.
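
The required ordering can be sketched in userspace C as below. This is an
illustration only: the stubs just record the order in which they run, the
*_sketch functions are hypothetical stand-ins for the module loader paths,
and only the four hook names are taken from this series.

```c
#include <stdio.h>
#include <string.h>

/* Records the order of calls as a comma-separated list. */
char call_log[128];

void reset_call_log(void)
{
	call_log[0] = '\0';
}

static void log_call(const char *name)
{
	size_t used = strlen(call_log);

	snprintf(call_log + used, sizeof(call_log) - used, "%s%s",
		 used ? "," : "", name);
}

/* Stubs standing in for the real ftrace and livepatch hooks. */
static void ftrace_module_enable(void) { log_call("ftrace_module_enable"); }
static void ftrace_release_mod(void)   { log_call("ftrace_release_mod"); }
static void klp_module_coming(void)    { log_call("klp_module_coming"); }
static void klp_module_going(void)     { log_call("klp_module_going"); }

/* Load: ftrace must be ready before livepatch applies its patches. */
void module_load_sketch(void)
{
	ftrace_module_enable();
	klp_module_coming();
}

/* Unload: the reverse order, livepatch first, then ftrace. */
void module_unload_sketch(void)
{
	klp_module_going();
	ftrace_release_mod();
}
```

Hard-coding the calls makes the order visible at the call site instead of
depending on notifier priorities.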

Tested the changes with a test livepatch module that patches 9p and nilfs2,
and verified that the issue described in [1] is fixed.

Patches are based on linux-next.

v1 can be found here -
https://lkml.kernel.org/g/1454049827-3726-1-git-send-email-j...@redhat.com
v2 can be found here -
https://lkml.kernel.org/g/1454375856-27757-1-git-send-email-j...@redhat.com

v3:
- Fix incorrect comments
- Rename klp_module_{enable,disable} to klp_module_{coming,going}
- Remove externs from livepatch.h
- Fix error handling in kernel/module.c

v2:
- Instead of splitting the ftrace and livepatch notifiers into coming + going
  notifiers and adjusting their priorities, remove ftrace and livepatch
  notifiers completely and hard-code the necessary function calls in the
  module loader.

[1] http://lkml.kernel.org/g/20160128204033.ga32...@packer-debian-8-amd64.digitalocean.com

Jessica Yu (2):
  ftrace/module: remove ftrace module notifier
  livepatch/module: remove livepatch module notifier

 include/linux/ftrace.h|   6 +-
 include/linux/livepatch.h |   9 +++
 kernel/livepatch/core.c   | 153 +++---
 kernel/module.c   |  24 +++-
 kernel/trace/ftrace.c |  36 +--
 5 files changed, 112 insertions(+), 116 deletions(-)

-- 
2.4.3

[PATCH v3 1/2] ftrace/module: remove ftrace module notifier

2016-02-05 Thread Jessica Yu

Remove the ftrace module notifier in favor of directly calling
ftrace_module_enable() and ftrace_release_mod() in the module loader.
Hard-coding the function calls directly in the module loader removes
dependence on the module notifier call chain and provides better
visibility and control over what gets called when, which is important
to kernel utilities such as livepatch.

Signed-off-by: Jessica Yu 
---
 include/linux/ftrace.h |  6 --
 kernel/module.c|  4 
 kernel/trace/ftrace.c  | 36 +---
 3 files changed, 9 insertions(+), 37 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 81de712..c2b340e 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -603,6 +603,7 @@ extern int ftrace_arch_read_dyn_info(char *buf, int size);
 
 extern int skip_trace(unsigned long ip);
 extern void ftrace_module_init(struct module *mod);
+extern void ftrace_module_enable(struct module *mod);
 extern void ftrace_release_mod(struct module *mod);
 
 extern void ftrace_disable_daemon(void);
@@ -612,8 +613,9 @@ static inline int skip_trace(unsigned long ip) { return 0; }
 static inline int ftrace_force_update(void) { return 0; }
 static inline void ftrace_disable_daemon(void) { }
 static inline void ftrace_enable_daemon(void) { }
-static inline void ftrace_release_mod(struct module *mod) {}
-static inline void ftrace_module_init(struct module *mod) {}
+static inline void ftrace_module_init(struct module *mod) { }
+static inline void ftrace_module_enable(struct module *mod) { }
+static inline void ftrace_release_mod(struct module *mod) { }
 static inline __init int register_ftrace_command(struct ftrace_func_command *cmd)
 {
return -EINVAL;
diff --git a/kernel/module.c b/kernel/module.c
index 9537da3..794ebe8 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -984,6 +984,8 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user,
mod->exit();
blocking_notifier_call_chain(&module_notify_list,
 MODULE_STATE_GOING, mod);
+   ftrace_release_mod(mod);
+
async_synchronize_full();
 
/* Store the name of the last unloaded module for diagnostic purposes */
@@ -3313,6 +3315,7 @@ fail:
module_put(mod);
blocking_notifier_call_chain(&module_notify_list,
 MODULE_STATE_GOING, mod);
+   ftrace_release_mod(mod);
free_module(mod);
wake_up_all(_wq);
return ret;
@@ -3389,6 +3392,7 @@ static int complete_formation(struct module *mod, struct load_info *info)
mod->state = MODULE_STATE_COMING;
mutex_unlock(_mutex);
 
+   ftrace_module_enable(mod);
blocking_notifier_call_chain(_notify_list,
 MODULE_STATE_COMING, mod);
return 0;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index eca592f..57a6eea 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4961,7 +4961,7 @@ void ftrace_release_mod(struct module *mod)
mutex_unlock(_lock);
 }
 
-static void ftrace_module_enable(struct module *mod)
+void ftrace_module_enable(struct module *mod)
 {
struct dyn_ftrace *rec;
struct ftrace_page *pg;
@@ -5038,38 +5038,8 @@ void ftrace_module_init(struct module *mod)
ftrace_process_locs(mod, mod->ftrace_callsites,
mod->ftrace_callsites + mod->num_ftrace_callsites);
 }
-
-static int ftrace_module_notify(struct notifier_block *self,
-   unsigned long val, void *data)
-{
-   struct module *mod = data;
-
-   switch (val) {
-   case MODULE_STATE_COMING:
-   ftrace_module_enable(mod);
-   break;
-   case MODULE_STATE_GOING:
-   ftrace_release_mod(mod);
-   break;
-   default:
-   break;
-   }
-
-   return 0;
-}
-#else
-static int ftrace_module_notify(struct notifier_block *self,
-   unsigned long val, void *data)
-{
-   return 0;
-}
 #endif /* CONFIG_MODULES */
 
-struct notifier_block ftrace_module_nb = {
-   .notifier_call = ftrace_module_notify,
-   .priority = INT_MIN,/* Run after anything that can remove kprobes */
-};
-
 void __init ftrace_init(void)
 {
extern unsigned long __start_mcount_loc[];
@@ -5098,10 +5068,6 @@ void __init ftrace_init(void)
  __start_mcount_loc,
  __stop_mcount_loc);
 
-   ret = register_module_notifier(&ftrace_module_nb);
-   if (ret)
-   pr_warning("Failed to register trace ftrace module exit notifier\n");
-
set_ftrace_early_filters();
 
return;
-- 
2.4.3

[PATCH v3 2/2] livepatch/module: remove livepatch module notifier

2016-02-05 Thread Jessica Yu

Remove the livepatch module notifier in favor of directly enabling and
disabling patches to modules in the module loader. Hard-coding the
function calls ensures that ftrace_module_enable() is run before
klp_module_coming() during module load, and that klp_module_going() is
run before ftrace_release_mod() during module unload. This way, ftrace
and livepatch code is run in the correct order during the module
load/unload sequence without dependence on the module notifier call chain.

This fixes a notifier ordering issue in which the ftrace module notifier
(and hence ftrace_module_enable()) for coming modules was being called
after klp_module_notify(), which caused livepatch modules to initialize
incorrectly.
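
The ordering guarantee described above can be sketched in plain C. This is
only an illustration, not the kernel code: the *_stub functions and the
trace_log array are hypothetical stand-ins that record the order in which
the hooks run, so the hard-coded sequence can be checked directly.

```c
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical log of which hook ran, in order. */
static const char *trace_log[MAX_EVENTS];
static size_t trace_len;

static void record(const char *what)
{
    if (trace_len < MAX_EVENTS)
        trace_log[trace_len++] = what;
}

/* Stand-ins for the real hooks; they only record that they ran. */
static void ftrace_module_enable_stub(void) { record("ftrace_module_enable"); }
static void klp_module_coming_stub(void)    { record("klp_module_coming"); }
static void klp_module_going_stub(void)     { record("klp_module_going"); }
static void ftrace_release_mod_stub(void)   { record("ftrace_release_mod"); }

/* Module load path: ftrace first, then livepatch. */
static void module_coming(void)
{
    ftrace_module_enable_stub();
    klp_module_coming_stub();
}

/* Module unload path: livepatch first, then ftrace. */
static void module_going(void)
{
    klp_module_going_stub();
    ftrace_release_mod_stub();
}
```

Because the calls are written out in sequence, the order no longer depends
on notifier priorities or on registration order.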

Signed-off-by: Jessica Yu 
---
 include/linux/livepatch.h |   9 +++
 kernel/livepatch/core.c   | 153 +++---
 kernel/module.c   |  20 +-
 3 files changed, 103 insertions(+), 79 deletions(-)

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index a882865..bd830d5 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -134,6 +134,15 @@ int klp_unregister_patch(struct klp_patch *);
 int klp_enable_patch(struct klp_patch *);
 int klp_disable_patch(struct klp_patch *);
 
+/* Called from the module loader during module coming/going states */
+int klp_module_coming(struct module *mod);
+void klp_module_going(struct module *mod);
+
+#else /* !CONFIG_LIVEPATCH */
+
+static inline int klp_module_coming(struct module *mod) { return 0; }
+static inline void klp_module_going(struct module *mod) { }
+
 #endif /* CONFIG_LIVEPATCH */
 
 #endif /* _LINUX_LIVEPATCH_H_ */
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index bc2c85c..1d47f96 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -99,12 +99,12 @@ static void klp_find_object_module(struct klp_object *obj)
/*
 * We do not want to block removal of patched modules and therefore
 * we do not take a reference here. The patches are removed by
-* a going module handler instead.
+* klp_module_going() instead.
 */
mod = find_module(obj->name);
/*
-* Do not mess work of the module coming and going notifiers.
-* Note that the patch might still be needed before the going handler
+* Do not mess work of klp_module_coming() and klp_module_going().
+* Note that the patch might still be needed before klp_module_going()
 * is called. Module functions can be called even in the GOING state
 * until mod->exit() finishes. This is especially important for
 * patches that modify semantic of the functions.
@@ -866,103 +866,112 @@ int klp_register_patch(struct klp_patch *patch)
 }
 EXPORT_SYMBOL_GPL(klp_register_patch);
 
-static int klp_module_notify_coming(struct klp_patch *patch,
-struct klp_object *obj)
+int klp_module_coming(struct module *mod)
 {
-   struct module *pmod = patch->mod;
-   struct module *mod = obj->mod;
int ret;
+   struct klp_patch *patch;
+   struct klp_object *obj;
 
-   ret = klp_init_object_loaded(patch, obj);
-   if (ret) {
-   pr_warn("failed to initialize patch '%s' for module '%s' (%d)\n",
-   pmod->name, mod->name, ret);
-   return ret;
-   }
+   if (WARN_ON(mod->state != MODULE_STATE_COMING))
+   return -EINVAL;
 
-   if (patch->state == KLP_DISABLED)
-   return 0;
+   mutex_lock(_mutex);
+   /*
+* Each module has to know that klp_module_coming()
+* has been called. We never know what module will
+* get patched by a new patch.
+*/
+   mod->klp_alive = true;
 
-   pr_notice("applying patch '%s' to loading module '%s'\n",
- pmod->name, mod->name);
+   list_for_each_entry(patch, _patches, list) {
+   klp_for_each_object(patch, obj) {
+   if (!klp_is_module(obj) || strcmp(obj->name, mod->name))
+   continue;
 
-   ret = klp_enable_object(obj);
-   if (ret)
-   pr_warn("failed to apply patch '%s' to module '%s' (%d)\n",
-   pmod->name, mod->name, ret);
-   return ret;
-}
+   obj->mod = mod;
 
-static void klp_module_notify_going(struct klp_patch *patch,
-   struct klp_object *obj)
-{
-   struct module *pmod = patch->mod;
-   struct module *mod = obj->mod;
+   ret = klp_init_object_loaded(patch, obj);
+   if (ret) {
+   pr_warn("failed to initialize patch '%s' for module '%s' (%d)\n",
+   patch->mod->name, obj->mod->name, ret);
+   goto err;
+   }
 
-   if (patch->state == KLP_DISABLED)
-

[PATCH 2/2] f2fs: preallocate blocks for buffered aio writes

2016-02-05 Thread Jaegeuk Kim

This patch preallocates data blocks for buffered aio writes.
With this patch, we can avoid redundant locking and unlocking of node pages
given consecutive aio request.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c  | 36 ++--
 fs/f2fs/f2fs.h  |  1 +
 include/linux/f2fs_fs.h |  2 +-
 3 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b95e28e..5071cf3 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -571,16 +571,25 @@ ssize_t f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
ssize_t ret = 0;
 
map.m_lblk = F2FS_BYTES_TO_BLK(iocb->ki_pos);
-   map.m_len = F2FS_BYTES_TO_BLK(iov_iter_count(from));
+   map.m_len = F2FS_BLK_ALIGN(iov_iter_count(from));
map.m_next_pgofs = NULL;
 
-   if (iocb->ki_flags & IOCB_DIRECT &&
-   !(f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))) {
+   if (f2fs_encrypted_inode(inode))
+   return 0;
+
+   if (iocb->ki_flags & IOCB_DIRECT) {
+   ret = f2fs_convert_inline_inode(inode);
+   if (ret)
+   return ret;
+   return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
+   }
+   if (iocb->ki_pos + iov_iter_count(from) > MAX_INLINE_DATA) {
ret = f2fs_convert_inline_inode(inode);
if (ret)
return ret;
-   ret = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
}
+   if (!f2fs_has_inline_data(inode))
+   return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO);
return ret;
 }
 
@@ -647,7 +656,13 @@ next_block:
err = -EIO;
goto sync_out;
}
-   err = __allocate_data_block(&dn);
+   if (flag == F2FS_GET_BLOCK_PRE_AIO) {
+   if (blkaddr == NULL_ADDR)
+   err = reserve_new_block(&dn);
+   dn.data_blkaddr = NEW_ADDR;
+   } else {
+   err = __allocate_data_block(&dn);
+   }
+   }
if (err)
goto sync_out;
allocated = true;
@@ -679,7 +694,8 @@ next_block:
} else if ((map->m_pblk != NEW_ADDR &&
blkaddr == (map->m_pblk + ofs)) ||
(map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR) ||
-   flag == F2FS_GET_BLOCK_PRE_DIO) {
+   flag == F2FS_GET_BLOCK_PRE_DIO ||
+   flag == F2FS_GET_BLOCK_PRE_AIO) {
ofs++;
map->m_len++;
} else {
@@ -1417,6 +1433,14 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
struct extent_info ei;
int err = 0;
 
+   /*
+* we already allocated all the blocks, so we don't need to get
+* the block addresses when there is no need to fill the page.
+*/
+   if (!f2fs_has_inline_data(inode) && !f2fs_encrypted_inode(inode) &&
+   len == PAGE_CACHE_SIZE)
+   return 0;
+
if (f2fs_has_inline_data(inode) ||
(pos & PAGE_CACHE_MASK) >= i_size_read(inode)) {
f2fs_lock_op(sbi);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 4451791..f6a841b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -392,6 +392,7 @@ struct f2fs_map_blocks {
 #define F2FS_GET_BLOCK_FIEMAP  2
 #define F2FS_GET_BLOCK_BMAP3
 #define F2FS_GET_BLOCK_PRE_DIO 4
+#define F2FS_GET_BLOCK_PRE_AIO 5
 
 /*
  * i_advise uses FADVISE_XXX_BIT. We can add additional hints later.
diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
index ac80402..f43e6a0 100644
--- a/include/linux/f2fs_fs.h
+++ b/include/linux/f2fs_fs.h
@@ -21,7 +21,7 @@
 #define F2FS_BLKSIZE   4096/* support only 4KB block */
 #define F2FS_BLKSIZE_BITS  12  /* bits for F2FS_BLKSIZE */
 #define F2FS_MAX_EXTENSION 64  /* # of extension entries */
-#define F2FS_BLK_ALIGN(x)  (((x) + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE)
+#define F2FS_BLK_ALIGN(x)  (((x) + F2FS_BLKSIZE - 1) >> F2FS_BLKSIZE_BITS)
 
 #define NULL_ADDR  ((block_t)0)/* used as block_t addresses */
 #define NEW_ADDR   ((block_t)-1)   /* used as block_t addresses */
-- 
2.6.3
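
The switch from F2FS_BYTES_TO_BLK() to F2FS_BLK_ALIGN() for map.m_len changes
the block count from rounding down to rounding up. The sketch below shows the
difference, and that the shift form of the macro matches the old division
form. It is a userspace illustration only: bytes_to_blk() assumes
F2FS_BYTES_TO_BLK() is a right shift by F2FS_BLKSIZE_BITS, which the patch
does not show, and the helper names are made up for this example.

```c
#include <stdint.h>

#define F2FS_BLKSIZE       4096   /* support only 4KB block */
#define F2FS_BLKSIZE_BITS  12     /* bits for F2FS_BLKSIZE */

/* Round down: whole blocks contained in `bytes` (assumed definition). */
static inline uint64_t bytes_to_blk(uint64_t bytes)
{
    return bytes >> F2FS_BLKSIZE_BITS;
}

/* Round up: blocks needed to hold `bytes`, as in the patched macro. */
static inline uint64_t blk_align_shift(uint64_t bytes)
{
    return (bytes + F2FS_BLKSIZE - 1) >> F2FS_BLKSIZE_BITS;
}

/* The pre-patch form of the macro, using a division instead of a shift. */
static inline uint64_t blk_align_div(uint64_t bytes)
{
    return (bytes + F2FS_BLKSIZE - 1) / F2FS_BLKSIZE;
}
```

For a 100-byte write the round-down helper yields 0 blocks while the
round-up helper yields 1, which is why a short buffered write needs the
aligned count to have any block preallocated at all.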

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread Arjan van de Ven


>> and most of the RT guys would only tolerate a little bit of it
>>
>> is there any real/practical use of going longer than 4 seconds? if there
>> is then yeah fixing it makes sense.
>> if it's just theoretical... shrug... 32 bit systems have a bunch of
>> other limits/differences as well.
>
> So I'd think it would be mostly theoretical, but in my testing on a
> VM, setting the timerslack for bash to 10 secs made time sleep 1 take
> ~10.5 seconds. So it's apparently not too hard to coalesce fairly far
> out (I need to spend a bit more time to verify that events really
> weren't happening during that time and we're not just doing
> unnecessary delays with the extra slack).

99% sure you're hitting something else;
we look pretty much only 1 ahead in the queue for timers to run to see if
they can be run, once we hit a timer that's not ready yet we stop.
your 10 second ahead is behind a whole bunch of other not-ready ones
so won't even be looked at until it's close

> But yea. My main concern is that if we do a consistent 64bit interface
> for all arches in the /proc/<pid>/timerslack_ns interface, it will
> make PR_GET_TIMERSLACK return incorrect results on 32bit systems when
> the slack is >= 2^32.


or we return UINT_MAX for that case. not too hard.
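
To make the two options concrete, here is a small userspace sketch of what a
32-bit return value would report for a 64-bit slack. The helper names are
hypothetical and this is not the prctl() implementation; it only contrasts
plain truncation with saturating at the 32-bit maximum.

```c
#include <stdint.h>

/* What a 32-bit return value can carry if nothing is done: the low 32 bits. */
static inline uint32_t slack_truncated(uint64_t slack_ns)
{
    return (uint32_t)slack_ns;
}

/* Saturating alternative: report UINT32_MAX when the slack does not fit. */
static inline uint32_t slack_clamped(uint64_t slack_ns)
{
    return slack_ns > UINT32_MAX ? UINT32_MAX : (uint32_t)slack_ns;
}
```

With a 10 second slack (10000000000 ns), truncation reports 1410065408 ns,
about 1.41 seconds, while the clamped version reports 4294967295 ns, the
~4 second ceiling discussed above.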

[PATCH 1/2] f2fs: move dio preallocation into f2fs_file_write_iter

2016-02-05 Thread Jaegeuk Kim

This patch moves preallocation code for direct IOs into f2fs_file_write_iter.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c | 38 +-
 fs/f2fs/f2fs.h |  2 ++
 fs/f2fs/file.c | 22 --
 3 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9ae43a7..b95e28e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -564,16 +564,24 @@ alloc:
return 0;
 }
 
-static int __allocate_data_blocks(struct inode *inode, loff_t offset,
-   size_t count)
+ssize_t f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 {
+   struct inode *inode = file_inode(iocb->ki_filp);
struct f2fs_map_blocks map;
+   ssize_t ret = 0;
 
-   map.m_lblk = F2FS_BYTES_TO_BLK(offset);
-   map.m_len = F2FS_BYTES_TO_BLK(count);
+   map.m_lblk = F2FS_BYTES_TO_BLK(iocb->ki_pos);
+   map.m_len = F2FS_BYTES_TO_BLK(iov_iter_count(from));
map.m_next_pgofs = NULL;
 
-   return f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_DIO);
+   if (iocb->ki_flags & IOCB_DIRECT &&
+   !(f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))) {
+   ret = f2fs_convert_inline_inode(inode);
+   if (ret)
+   return ret;
+   ret = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO);
+   }
+   return ret;
 }
 
 /*
@@ -670,7 +678,8 @@ next_block:
map->m_len = 1;
} else if ((map->m_pblk != NEW_ADDR &&
blkaddr == (map->m_pblk + ofs)) ||
-   (map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR)) {
+   (map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR) ||
+   flag == F2FS_GET_BLOCK_PRE_DIO) {
ofs++;
map->m_len++;
} else {
@@ -1614,34 +1623,21 @@ static int check_direct_IO(struct inode *inode, struct iov_iter *iter,
 static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
  loff_t offset)
 {
-   struct file *file = iocb->ki_filp;
-   struct address_space *mapping = file->f_mapping;
+   struct address_space *mapping = iocb->ki_filp->f_mapping;
struct inode *inode = mapping->host;
size_t count = iov_iter_count(iter);
int err;
 
-   /* we don't need to use inline_data strictly */
-   err = f2fs_convert_inline_inode(inode);
+   err = check_direct_IO(inode, iter, offset);
if (err)
return err;
 
if (f2fs_encrypted_inode(inode) && S_ISREG(inode->i_mode))
return 0;
 
-   err = check_direct_IO(inode, iter, offset);
-   if (err)
-   return err;
-
trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
 
-   if (iov_iter_rw(iter) == WRITE) {
-   err = __allocate_data_blocks(inode, offset, count);
-   if (err)
-   goto out;
-   }
-
err = blockdev_direct_IO(iocb, inode, iter, offset, get_data_block_dio);
-out:
if (err < 0 && iov_iter_rw(iter) == WRITE)
f2fs_write_failed(mapping, offset + count);
 
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 5f98236..4451791 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -391,6 +391,7 @@ struct f2fs_map_blocks {
 #define F2FS_GET_BLOCK_DIO 1
 #define F2FS_GET_BLOCK_FIEMAP  2
 #define F2FS_GET_BLOCK_BMAP3
+#define F2FS_GET_BLOCK_PRE_DIO 4
 
 /*
  * i_advise uses FADVISE_XXX_BIT. We can add additional hints later.
@@ -1905,6 +1906,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *);
 void set_data_blkaddr(struct dnode_of_data *);
 int reserve_new_block(struct dnode_of_data *);
 int f2fs_get_block(struct dnode_of_data *, pgoff_t);
+ssize_t f2fs_preallocate_blocks(struct kiocb *, struct iov_iter *);
 int f2fs_reserve_block(struct dnode_of_data *, pgoff_t);
 struct page *get_read_data_page(struct inode *, pgoff_t, int, bool);
 struct page *find_data_page(struct inode *, pgoff_t);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 50fa296..3ffb6c1 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1873,14 +1873,32 @@ long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 
 static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
-   struct inode *inode = file_inode(iocb->ki_filp);
+   struct file *file = iocb->ki_filp;
+   struct inode *inode = file_inode(file);
+   ssize_t ret;
 
if (f2fs_encrypted_inode(inode) &&
!f2fs_has_encryption_key(inode) &&
f2fs_get_encryption_info(inode))
return -EACCES;
 
-   return generic_file_write_iter(iocb, from);
+   mutex_lock(&inode->i_mutex);
+   ret = generic_write_checks(iocb, from);
+   if (ret > 0) {
+   ret =

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread John Stultz

On Fri, Feb 5, 2016 at 6:15 PM, Arjan van de Ven  wrote:
> On 2/5/2016 4:51 PM, John Stultz wrote:
>>
>> Arjan/Thomas:  One curious thing I noticed here while writing some
>> documentation. The timer_slack_ns value in the task struct is a
>> unsigned long.
>>
>> So this means PR_SET_TIMERSLACK limits the maximum slack on 32 bit
>> machines to ~4 seconds. Where on 64bit machines it can be quite a bit
>> longer (unreasonably long, really :).
>
>
> originally when we created timerslack, 4 seconds was an eternity and good
> enough for everyone
> by a mile... (assumption was practical upper limit being in the 15 msec
> range)
> and most of the RT guys would only tolerate a little bit of it
>
> is there any real/practical use of going longer than 4 seconds? if there
> is then yeah fixing it makes sense.
> if it's just theoretical... shrug... 32 bit systems have a bunch of
> other limits/differences as well.

So I'd think it would be mostly theoretical, but in my testing on a
VM, setting the timerslack for bash to 10 secs made time sleep 1 take
~10.5 seconds. So it's apparently not too hard to coalesce fairly far
out (I need to spend a bit more time to verify that events really
weren't happening during that time and we're not just doing
unnecessary delays with the extra slack).

But yea. My main concern is that if we do a consistent 64bit interface
for all arches in the /proc/<pid>/timerslack_ns interface, it will
make PR_GET_TIMERSLACK return incorrect results on 32bit systems when
the slack is >= 2^32.

I've got a first pass of the patch done which just uses ULONG_MAX on
the respective arch, but I've got to close up for the day, so I'll see
about doing a follow on patch that makes the task timer_slack_ns value
be a u64 and extends the interface to use that as well on all arches,
and send out both on monday so folks can see which they prefer.

thanks
-john


Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread John Stultz

On Fri, Feb 5, 2016 at 5:56 PM, Andrew Morton  wrote:
> On Fri, 5 Feb 2016 16:51:02 -0800 John Stultz  wrote:
>> Alternatively, with the /proc/pid/timerslack_ns interface I'm working
>> on, we can make the backing storage a long long and support 64bits of
>> nanoseconds on all architectures. (But again, we can't really change
>> PR_SET/GET_TIMERSLACK, so 32bit systems might see strange values from
>> that with larger than uint slack values).
>>
>> Or I can just leave it as ULONG_MAX on all interfaces.
>>
>> Thoughts or preferences?
>
> /proc/<pid>/timer_slack_us?

So the issue isn't so much in the new interface (we can have it take a
long long), but really in the existing PR_GET/SET_TIMERSLACK. I'm just
trying to figure out if following the existing oddness is the best
approach, or if we should make the new interface do a more consistent
thing, but with the result that the PR_GET_TIMERSLACK interface might
return "incorrect" values (just the lower 32bits).

thanks
-john

RE: [RESEND x2][PATCH v2] block: partition: Add partition specific uevent callbacks for partition info

2016-02-05 Thread Caizhiyong

> > Interestingly, this feature appears to already be documented in
> > Documentation/block/cmdline-partition.txt.  I wonder how that happened.
> > Maybe we used to do this but it got taken away?

That documentation describes how partition names are used; my patch adds
support for getting the partition name from Android userspace.
The mainline kernel does not appear to support the 'PARTNAME' uevent, but
this feature is very convenient.

> 
> Heh. Looks like the documentation was added not too long ago (by Cai -
> cc'ed). I suspect they had been working w/ the Android tree and
> assumed the functionality was already upstream?
> 
> > It seems bad that we don't document uevents in any organized fashion.
> > But the audience is very small and knows how to find kernel source code
> > so I guess it doesn't matter.
> >
> > Anyway, please do check that the conveniently self-adding documentation
> > is accurate and complete.
> 
> It does match the behavior this patch provides from Android. It is
> somewhat tangential to the functionality described in the
> documentation, so I'm not sure of its measure of completeness (for
> example, it doesn't talk about PARTN parameter, but again, the
> documentation is covering how to specify partition info via the boot
> cmdline, and isn't really covering the uevents - the uevent was just a
> mentioned side-effect for the partition name portion of the cmdline
> option).
> 
> thanks
> -john

Re: [PATCH V2 0/7] cpufreq: governors: Fix ABBA lockups

2016-02-05 Thread Saravana Kannan


On 02/04/2016 07:54 PM, Rafael J. Wysocki wrote:

On Thursday, February 04, 2016 07:18:32 PM Rafael J. Wysocki wrote:

On Thu, Feb 4, 2016 at 6:44 PM, Saravana Kannan  wrote:

On 02/04/2016 09:43 AM, Saravana Kannan wrote:


On 02/04/2016 03:09 AM, Viresh Kumar wrote:


On 04-02-16, 00:50, Rafael J. Wysocki wrote:


This is exactly right.  We've avoided one deadlock only to trip into
another one.

This happens because update_sampling_rate() acquires
od_dbs_cdata.mutex which is held around cpufreq_governor_exit() by
cpufreq_governor_dbs().

Worse yet, a deadlock can still happen without (the new)
dbs_data->mutex, just between s_active and od_dbs_cdata.mutex if
update_sampling_rate() runs in parallel with
cpufreq_governor_dbs()->cpufreq_governor_exit() and the latter wins
the race.

It looks like we need to drop the governor mutex before putting the
kobject in cpufreq_governor_exit().




[cut]



No no no no! Let's not open up this can of worms of queuing up the work
to handle a write to a sysfs file. It *MIGHT* work for this specific
tunable (I haven't bothered to analyze), but this makes it impossible to
return a useful/proper error value.



Sent too soon. Not only that, but it can also cause the writes to the sysfs
files to get processed in a different order and I don't know what other
issues/races THAT will open up.


Well, I don't like this either.

I actually do have an idea about how to fix these deadlocks, but it is
on top of my cleanup series.

I'll write more about it later today.


Having actually posted that series again after cleaning it up I can say
what I'm thinking about hopefully without confusing anyone too much.  So
please bear in mind that I'm going to refer to this series below:

http://marc.info/?l=linux-pm&m=145463901630950&w=4

Also this is more of a brain dump rather than actual design description,
so there may be holes etc in it.  Please let me know if you can see any.

The problem at hand is that policy->rwsem needs to be held around *all*
operations in cpufreq_set_policy().  In particular, it cannot be dropped
around invocations of __cpufreq_governor() with the event arg equal to
_EXIT as that leads to interesting races.

Unfortunately, we know that holding policy->rwsem in those places leads
to a deadlock with governor sysfs attributes removal in cpufreq_governor_exit().

Viresh attempted to fix this by not acquiring policy->rwsem for governor
attributes access (as holding it is not necessary for them in principle).  That
was a nice try, but it turned out to be insufficient because of another deadlock
scenario uncovered by it.  Namely, since the ondemand governor's update_sampling_rate()
acquires the governor mutex (called dbs_data_mutex after my patches mentioned
above), it may deadlock with exactly the same piece of code in cpufreq_governor_exit()
in almost exactly the same way.

To avoid that other deadlock, we'd either need to drop dbs_data_mutex from
update_sampling_rate(), or drop it for the removal of the governor sysfs
attributes in cpufreq_governor_exit().  I don't think the former is an option
at least at this point, so it looks like we pretty much have to do the latter.

With that in mind, I'd start with the changes made by Viresh (maybe without the
first patch which really isn't essential here).  That is, introduce a separate
kobject type for the governor attributes kobject and register that in
cpufreq_governor_init().  The show/store callbacks for that kobject type won't
acquire policy->rwsem so the first deadlock will be avoided.

But in addition to that, I'd drop dbs_data_mutex before the removal of governor
sysfs attributes.  That actually happens in two places, in cpufreq_governor_exit()
and in the error path of cpufreq_governor_init().

To that end, I'd move the locking from cpufreq_governor_dbs() to the functions
called by it.  That should be readily doable and they can do all of the
necessary checks themselves.  cpufreq_governor_dbs() would become a pure mux then,
but that's not such a big deal.

With that, cpufreq_governor_exit() may just drop the lock before it does the
final kobject_put().  The danger here is that the sysfs show/store callbacks of
the governor attributes kobject may see invalid dbs_data for a while, after the
lock has been dropped and before the kobject is deleted.  That may be addressed
by checking, for example, the presence of the dbs_data's "tuners" pointer in those
callbacks.  If it is NULL, they can simply return -EAGAIN or similar.

Now, that means, though, that they need to acquire the same lock as
cpufreq_governor_exit(), or they may see things go away while they are running.
The simplest approach here would be to take dbs_data_mutex in them too, although
that's a bit of a sledgehammer.  It might be better to have a per-policy lock
in struct policy_dbs_info for that, for example, but then the governor attribute
sysfs callbacks would need to get that object instead of dbs_data.
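
A rough userspace sketch of the scheme described above, using a pthread mutex
in place of dbs_data_mutex. The struct and function names are hypothetical
and nothing here is the cpufreq code; it only shows the shape of the idea:
invalidate the data under the lock, drop the lock, and only then do the final
put, while the attribute callbacks take the same lock and bail out with
-EAGAIN once "tuners" is gone.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct tuners {
    unsigned int sampling_rate;
};

struct dbs_data_sketch {
    pthread_mutex_t lock;     /* plays the role of dbs_data_mutex */
    struct tuners *tuners;    /* NULL once the governor has exited */
    int kobject_released;     /* set by the stand-in for kobject_put() */
};

/* sysfs store callback: refuse to touch data that is being torn down. */
static int store_sampling_rate(struct dbs_data_sketch *d, unsigned int rate)
{
    int ret = 0;

    pthread_mutex_lock(&d->lock);
    if (!d->tuners)
        ret = -EAGAIN;
    else
        d->tuners->sampling_rate = rate;
    pthread_mutex_unlock(&d->lock);
    return ret;
}

/* Governor exit: invalidate under the lock, drop it, then release. */
static void governor_exit(struct dbs_data_sketch *d)
{
    pthread_mutex_lock(&d->lock);
    d->tuners = NULL;
    pthread_mutex_unlock(&d->lock);

    /* Stand-in for the final kobject_put(), done without the lock held. */
    d->kobject_released = 1;
}
```

Since the release step runs with the lock dropped, a callback that is still
in flight can take the lock, see the NULL pointer and return, instead of
blocking against the teardown.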

On the flip side, it might be possible to

[GIT PULL] Ceph fixes for -rc3

2016-02-05 Thread Sage Weil

Hi Linus,

Please pull the following Ceph fixes from

  git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git for-linus

We have a few wire protocol compatibility fixes, ports of a few recent 
CRUSH mapping changes, and a couple error path fixes.

Thanks!
sage



Dan Carpenter (1):
  ceph: checking for IS_ERR instead of NULL

Ilya Dryomov (6):
  crush: ensure bucket id is valid before indexing buckets array
  crush: ensure take bucket value is valid
  crush: add chooseleaf_stable tunable
  crush: decode and initialize chooseleaf_stable
  libceph: advertise support for TUNABLES5
  libceph: MOSDOpReply v7 encoding

Yan, Zheng (1):
  ceph: fix snap context leak in error path

 fs/ceph/file.c |  6 +++---
 include/linux/ceph/ceph_features.h | 16 +++-
 include/linux/crush/crush.h|  8 +++-
 net/ceph/crush/mapper.c| 33 ++---
 net/ceph/osd_client.c  | 10 ++
 net/ceph/osdmap.c  | 19 ++-
 6 files changed, 75 insertions(+), 17 deletions(-)

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread Arjan van de Ven


On 2/5/2016 4:51 PM, John Stultz wrote:
> On Fri, Feb 5, 2016 at 2:35 PM, John Stultz  wrote:
>> On Fri, Feb 5, 2016 at 12:50 PM, Andrew Morton
>>  wrote:
>>> On Fri, 5 Feb 2016 12:44:04 -0800 Kees Cook  wrote:
>>>> Could this be exposed as a writable /proc entry instead? Like the oom_* stuff?
>>>
>>> /proc/<pid>/timer_slack_ns, guarded by ptrace_may_access(), documented
>>> under Documentation/?  Yup, that would work.  It's there for all
>>> architectures from day one and there is precedent.  It's not as nice,
>>> but /proc nasties will always be with us.
>>
>> Ok. I'll start working on that.
>
> Arjan/Thomas:  One curious thing I noticed here while writing some
> documentation. The timer_slack_ns value in the task struct is a
> unsigned long.
>
> So this means PR_SET_TIMERSLACK limits the maximum slack on 32 bit
> machines to ~4 seconds. Where on 64bit machines it can be quite a bit
> longer (unreasonably long, really :).

originally when we created timerslack, 4 seconds was an eternity and good
enough for everyone by a mile... (assumption was practical upper limit
being in the 15 msec range)
and most of the RT guys would only tolerate a little bit of it

is there any real/practical use of going longer than 4 seconds? if there
is then yeah fixing it makes sense.
if it's just theoretical... shrug... 32 bit systems have a bunch of
other limits/differences as well.

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread Andrew Morton

On Fri, 5 Feb 2016 16:51:02 -0800 John Stultz  wrote:

> On Fri, Feb 5, 2016 at 2:35 PM, John Stultz  wrote:
> > On Fri, Feb 5, 2016 at 12:50 PM, Andrew Morton
> >  wrote:
> >> On Fri, 5 Feb 2016 12:44:04 -0800 Kees Cook  wrote:
> >>> Could this be exposed as a writable /proc entry instead? Like the oom_* 
> >>> stuff?
> >>
> >> /proc/<pid>/timer_slack_ns, guarded by ptrace_may_access(), documented
> >> under Documentation/?  Yup, that would work.  It's there for all
> >> architectures from day one and there is precedent.  It's not as nice,
> >> but /proc nasties will always be with us.
> >
> > Ok. I'll start working on that.
> 
> Arjan/Thomas:  One curious thing I noticed here while writing some
> documentation. The timer_slack_ns value in the task struct is a
> unsigned long.
> 
> So this means PR_SET_TIMERSLACK limits the maximum slack on 32 bit
> machines to ~4 seconds. Where on 64bit machines it can be quite a bit
> longer (unreasonably long, really :).
> 
> While 4 seconds is probably a reasonable interactivity limit, testing
> w/ 10 second slack values on a VM showed those timers pushed back to
> almost 10 seconds. So it may be useful to have > 4 second slack values
> generally. Thus left alone this seems like an unfair disadvantage to
> 32bit machines.
> 
> We can't do too much about the PR_GET_TIMERSLACK/PR_SET_TIMERSLACK
> interfaces, since its ABI and specifies a long, so one option there
> would be to make sure the value specified is capped to UINT_MAX which
> would keep the max value to ~4 seconds on all architectures.
> 
> Alternatively, with the /proc/pid/timerslack_ns interface I'm working
> on, we can make the backing storage a long long and support 64bits of
> nanoseconds on all architectures. (But again, we can't really change
> PR_SET/GET_TIMERSLACK, so 32bit systems might see strange values from
> that with larger than uint slack values).
> 
> Or I can just leave it as ULONG_MAX on all interfaces.
> 
> Thoughts or preferences?

/proc/<pid>/timer_slack_us?

Re: [PATCH] devm_memremap: Fix error value when memremap failed

2016-02-05 Thread Dan Williams

On Fri, Feb 5, 2016 at 5:49 PM, Dan Williams  wrote:
> On Fri, Feb 5, 2016 at 5:40 PM, Toshi Kani  wrote:
>> devm_memremap() returns an ERR_PTR() value in case of error.
>> However, it returns NULL when memremap() failed.  This causes
>> the caller, such as the pmem driver, to proceed and oops later.
>>
>> Change devm_memremap() to return ERR_PTR(-ENXIO) when memremap()
>> failed.
>>
>> Signed-off-by: Toshi Kani 
>> Cc: Dan Williams 
>> Cc: Andrew Morton 
>
> Acked-by: Dan Williams 

Should also go to -stable, I'll add that and include this with some
other fixes I have brewing.

Re: [PATCH] devm_memremap: Fix error value when memremap failed

2016-02-05 Thread Dan Williams

On Fri, Feb 5, 2016 at 5:40 PM, Toshi Kani  wrote:
> devm_memremap() returns an ERR_PTR() value in case of error.
> However, it returns NULL when memremap() failed.  This causes
> the caller, such as the pmem driver, to proceed and oops later.
>
> Change devm_memremap() to return ERR_PTR(-ENXIO) when memremap()
> failed.
>
> Signed-off-by: Toshi Kani 
> Cc: Dan Williams 
> Cc: Andrew Morton 

Acked-by: Dan Williams

[PATCH] spi: fix spi.h kernel-doc warning

2016-02-05 Thread Randy Dunlap

From: Randy Dunlap 

Fix kernel-doc warning for missing struct field notation.

..//include/linux/spi/spi.h:540: warning: No description found for parameter 'max_transfer_size'

Signed-off-by: Randy Dunlap 
---
 include/linux/spi/spi.h |2 ++
 1 file changed, 2 insertions(+)

--- lnx-45-rc2.orig/include/linux/spi/spi.h
+++ lnx-45-rc2/include/linux/spi/spi.h
@@ -303,6 +303,8 @@ static inline void spi_unregister_driver
  * @min_speed_hz: Lowest supported transfer speed
  * @max_speed_hz: Highest supported transfer speed
  * @flags: other constraints relevant to this driver
+ * @max_transfer_size: function that returns the max transfer size for
+ * a &spi_device; may be %NULL, so the default %SIZE_MAX will be used.
  * @bus_lock_spinlock: spinlock for SPI bus locking
  * @bus_lock_mutex: mutex for SPI bus locking
  * @bus_lock_flag: indicates that the SPI bus is locked for exclusive use

Re: [PATCH] devm_memremap: Fix error value when memremap failed

2016-02-05 Thread Ross Zwisler

On Fri, Feb 05, 2016 at 06:40:27PM -0700, Toshi Kani wrote:
> devm_memremap() returns an ERR_PTR() value in case of error.
> However, it returns NULL when memremap() failed.  This causes
> the caller, such as the pmem driver, to proceed and oops later.
> 
> Change devm_memremap() to return ERR_PTR(-ENXIO) when memremap()
> failed.
> 
> Signed-off-by: Toshi Kani 
> Cc: Dan Williams 
> Cc: Andrew Morton 

Yep, good catch.

Reviewed-by: Ross Zwisler 

> ---
>  kernel/memremap.c |4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index 70ee377..3427cca 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -136,8 +136,10 @@ void *devm_memremap(struct device *dev, resource_size_t 
> offset,
>   if (addr) {
>   *ptr = addr;
>   devres_add(dev, ptr);
> - } else
> + } else {
>   devres_free(ptr);
> + return ERR_PTR(-ENXIO);
> + }
>  
>   return addr;
>  }
> ___
> Linux-nvdimm mailing list
> linux-nvd...@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm

Re: [RFC] why the amount of cache from "free -m" and /proc/meminfo are different?

2016-02-05 Thread Xishi Qiu

On 2016/2/5 19:44, Daniel K. wrote:

> On 02/05/2016 07:50 AM, Xishi Qiu wrote:
>> [root@localhost ~]# free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:          48295         574       41658           8        6062       46344
>> Swap:         24191           0       24191
>>
>> [root@localhost ~]# cat /proc/meminfo
>> Buffers:   0 kB
>> Cached:  3727824 kB
>> Slab:2480092 kB
> 
> free and meminfo seems to match up pretty well to me.
> 
> Are you really asking about display in MB vs kB?
> 

Hi Daniel,

No, I mean "Cached: 3727824 kB" and "buff/cache 6062M" are different.

Does "buff/cache" include Buffers, Cached, and Slab?
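
One way to sanity-check the question above is to add up the three
/proc/meminfo fields and convert the sum to MiB. The sketch below does only
that arithmetic. The helper names are hypothetical, and the assumption that
free reports Buffers + Cached + Slab (rather than, say, only the reclaimable
part of Slab) depends on the procps version and is not confirmed anywhere in
this thread.

```c
#include <assert.h>

/*
 * /proc/meminfo reports values in kB (really KiB). Convert to whole MiB by
 * truncating; that `free -m` rounds this way is an assumption.
 */
static inline unsigned long kib_to_mib(unsigned long kib)
{
	return kib / 1024;
}

/* Hypothetical helper: buff/cache as Buffers + Cached + Slab, in MiB. */
static inline unsigned long buff_cache_mib(unsigned long buffers_kib,
					   unsigned long cached_kib,
					   unsigned long slab_kib)
{
	unsigned long sum = buffers_kib + cached_kib + slab_kib;

	/* Guard against wrap-around of the unsigned sum. */
	assert(sum >= cached_kib);
	return kib_to_mib(sum);
}
```

With the values quoted above, buff_cache_mib(0, 3727824, 2480092) evaluates
to 6062, which matches the buff/cache column, while Cached alone is only
3640 MiB.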

Thanks,
Xishi Qiu

> Drop the -m switch to free.
> 
> Also, give 'man free' a spin, it explains what's behind the numbers.
> 
> 
> Daniel K.
> 
> .
>

Re: Bisected Regression 4.3.5 => 4.4.1 booting HP ZBook in EFI mode

2016-02-05 Thread Greg Kroah-Hartman

On Fri, Feb 05, 2016 at 06:41:56PM -0500, Phil Turmel wrote:
> On 02/05/2016 05:29 PM, Greg Kroah-Hartman wrote:
> > On Fri, Feb 05, 2016 at 04:48:52PM -0500, Phil Turmel wrote:
> 
> >> I'm stumped as to how that powerpc patch can affect my x86 laptop, an
> >> HP ZBook 17 w/ i7 processor & nouveau graphics, but it certainly
> >> does.  The bisect was stable and I confirmed by reverting it on
> >> top of the intended v4.4.1.
> > 
> > That's crazy, nothing should even be rebuilt if you revert that patch,
> > so I don't see how that could affect things here.
> 
> I thought so too, but ...
> 
> > Can you verify that nothing does get rebuilt when you do this?
> 
> # git checkout v4.4.1
> 
> # make -j15
> // lots of output ///
> 
> # make
>   CHK include/config/kernel.release
>   CHK include/generated/uapi/linux/version.h
>   CHK include/generated/utsrelease.h
>   CHK include/generated/bounds.h
>   CHK include/generated/timeconst.h
>   CHK include/generated/asm-offsets.h
>   CALL scripts/checksyscalls.sh
>   CHK include/generated/compile.h
>   CHK kernel/config_data.h
> Kernel: arch/x86/boot/bzImage is ready  (#133)
>   Building modules, stage 2.
>   MODPOST 1166 modules
> 
> # git revert badc688
> // trimmed commit log ///
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> # make
>   CHK include/config/kernel.release
>   UPD include/config/kernel.release
>   CHK include/generated/uapi/linux/version.h
>   CHK include/generated/utsrelease.h
>   UPD include/generated/utsrelease.h
>   CHK include/generated/bounds.h
>   CHK include/generated/timeconst.h
>   CHK include/generated/asm-offsets.h
>   CALL scripts/checksyscalls.sh
>   CHK include/generated/compile.h
>   CC  init/version.o
>   LD  init/built-in.o
>   CC  kernel/sys.o
>   CC  kernel/trace/trace.o
>   LD  kernel/trace/built-in.o
>   CC  kernel/module.o
>   CHK kernel/config_data.h
>   LD  kernel/built-in.o
>   CC  drivers/base/firmware_class.o
>   LD  drivers/base/built-in.o
>   CC  drivers/gpu/drm/i915/i915_gpu_error.o
>   LD  drivers/gpu/drm/i915/i915.o
>   LD  drivers/gpu/drm/i915/built-in.o
>   LD  drivers/gpu/drm/built-in.o
>   LD  drivers/gpu/built-in.o
>   CC  drivers/target/target_core_configfs.o
>   LD  drivers/target/target_core_mod.o
>   LD  drivers/target/built-in.o
>   CC [M]  drivers/vhost/scsi.o
>   LD [M]  drivers/vhost/vhost_scsi.o
>   LD  drivers/built-in.o
>   LINK vmlinux
>   LD  vmlinux.o
>   MODPOST vmlinux.o
>   GEN .version
>   CHK include/generated/compile.h
>   UPD include/generated/compile.h
>   CC  init/version.o
>   LD  init/built-in.o
>   KSYM .tmp_kallsyms1.o
>   KSYM .tmp_kallsyms2.o
>   LD  vmlinux
>   SORTEX  vmlinux
>   SYSMAP  System.map
>   VOFFSET arch/x86/boot/voffset.h
>   OBJCOPY arch/x86/boot/compressed/vmlinux.bin
>   LZMA arch/x86/boot/compressed/vmlinux.bin.lzma
>   MKPIGGY arch/x86/boot/compressed/piggy.S
>   AS  arch/x86/boot/compressed/piggy.o
>   LD  arch/x86/boot/compressed/vmlinux
>   ZOFFSET arch/x86/boot/zoffset.h
>   AS  arch/x86/boot/header.o
>   CC  arch/x86/boot/version.o
>   LD  arch/x86/boot/setup.elf
>   OBJCOPY arch/x86/boot/setup.bin
>   OBJCOPY arch/x86/boot/vmlinux.bin
>   BUILD   arch/x86/boot/bzImage
> Setup is 15564 bytes (padded to 15872 bytes).
> System is 26558 kB
> CRC 29b78b83
> Kernel: arch/x86/boot/bzImage is ready  (#134)
>   Building modules, stage 2.
>   MODPOST 1166 modules
> 
> //
> So, a handful of items in the main kernel get
> rebuilt, including i915 stuff.  This laptop does
> have the intel graphics base with nvidia layered
> on top.
> 
> The build proceeded to redo all my modules, shown below.

Ah, you have versioned modules / builds enabled; that's what caused the
rebuild. If you disable CONFIG_MODVERSIONS and
CONFIG_MODULE_SRCVERSION_ALL, you shouldn't rebuild everything.

If those options are disabled, then something really odd is going on
here...

thanks,

greg k-h

[PATCH] param-convert-some-on-off-users-to-strtobool fix

2016-02-05 Thread Kees Cook

This converts a missed __setup return (and silences the build warning it
was causing).

Signed-off-by: Kees Cook 
---
 arch/powerpc/kernel/rtasd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 0ae5cb84d4e2..aa610ce8742f 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -592,8 +592,6 @@ __setup("surveillance=", surveillance_setup);
 
 static int __init rtasmsgs_setup(char *str)
 {
-   kstrtobool(str, &full_rtas_msgs);
-
-   return 1;
+   return (kstrtobool(str, &full_rtas_msgs) == 0);
 }
 __setup("rtasmsgs=", rtasmsgs_setup);
-- 
2.6.3


-- 
Kees Cook
Chrome OS & Brillo Security
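
For readers unfamiliar with the __setup() convention the patch above relies
on: a handler returns 1 when it has consumed the option and 0 when it has
not. The following is a user-space sketch of that pattern, not kernel code.
sketch_strtobool() is a hypothetical stand-in for kstrtobool() (0 on
success, -EINVAL on failure), and the set of accepted strings is an
assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kstrtobool(): 0 on success, -EINVAL otherwise. */
static inline int sketch_strtobool(const char *s, bool *res)
{
	assert(res != NULL);
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		/* "on" / "off" */
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}
	return -EINVAL;
}

/* Stands in for the flag controlled by the rtasmsgs= option. */
static bool sketch_full_msgs;

/* Mirrors the patched handler: 1 = option handled, 0 = not handled. */
static inline int sketch_rtasmsgs_setup(const char *str)
{
	return sketch_strtobool(str, &sketch_full_msgs) == 0;
}
```

Folding the parse result into the return value means an unparsable argument
is reported as "not handled" instead of being silently accepted.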

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread John Stultz

On Fri, Feb 5, 2016 at 2:35 PM, John Stultz  wrote:
> On Fri, Feb 5, 2016 at 12:50 PM, Andrew Morton
>  wrote:
>> On Fri, 5 Feb 2016 12:44:04 -0800 Kees Cook  wrote:
>>> Could this be exposed as a writable /proc entry instead? Like the oom_* 
>>> stuff?
>>
>> /proc/<pid>/timer_slack_ns, guarded by ptrace_may_access(), documented
>> under Documentation/?  Yup, that would work.  It's there for all
>> architectures from day one and there is precedent.  It's not as nice,
>> but /proc nasties will always be with us.
>
> Ok. I'll start working on that.

Arjan/Thomas:  One curious thing I noticed here while writing some
documentation. The timer_slack_ns value in the task struct is an
unsigned long.

So this means PR_SET_TIMERSLACK limits the maximum slack on 32 bit
machines to ~4 seconds, whereas on 64bit machines it can be quite a bit
longer (unreasonably long, really :).

While 4 seconds is probably a reasonable interactivity limit, testing
w/ 10 second slack values on a VM showed those timers pushed back to
almost 10 seconds. So it may be useful to have > 4 second slack values
generally. Thus left alone this seems like an unfair disadvantage to
32bit machines.
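
The ~4 second figure mentioned above follows directly from the width of an
unsigned nanosecond counter. A minimal sketch of the arithmetic, with a
hypothetical helper name; it models only the integer range and says nothing
about how the kernel actually applies slack:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Largest whole number of seconds representable by an unsigned nanosecond
 * counter that is 32 or 64 bits wide.
 */
static inline uint64_t max_slack_seconds(unsigned int bits)
{
	uint64_t max_ns;

	assert(bits == 32 || bits == 64);
	max_ns = (bits == 32) ? UINT32_MAX : UINT64_MAX;
	return max_ns / SKETCH_NSEC_PER_SEC;
}
```

max_slack_seconds(32) is 4 (UINT32_MAX is 4294967295 ns), while
max_slack_seconds(64) is 18446744073 seconds, i.e. several centuries.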

We can't do too much about the PR_GET_TIMERSLACK/PR_SET_TIMERSLACK
interfaces, since it's ABI and specifies a long, so one option there
would be to make sure the value specified is capped to UINT_MAX which
would keep the max value to ~4 seconds on all architectures.

Alternatively, with the /proc/pid/timerslack_ns interface I'm working
on, we can make the backing storage a long long and support 64bits of
nanoseconds on all architectures. (But again, we can't really change
PR_SET/GET_TIMERSLACK, so 32bit systems might see strange values from
that with larger than uint slack values).

Or I can just leave it as ULONG_MAX on all interfaces.

Thoughts or preferences?

thanks
-john

[git pull] drm fixes

2016-02-05 Thread Dave Airlie


Hi Linus,

Fixes all over the place:

amdkfd: two static checker fixes
mst: a bunch of static checker and spec/hw interaction fixes
amdgpu: fix Iceland hw properly, and some fiji bugs, along with
some write-combining fixes.
exynos: some regression fixes
adv7511: fix some EDID reading issues.

Dave.

The following changes since commit 36f90b0a2ddd60823fe193a85e60ff1906c2a9b3:

  Linux 4.5-rc2 (2016-01-31 18:12:16 -0800)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 6739b3d7bc18a5373efd863b11831e8f515fffe1:

  Merge branch 'drm-fixes-mst' of git://people.freedesktop.org/~airlied/linux into drm-fixes (2016-02-05 15:24:17 +1000)


Alex Deucher (10):
  drm/amdgpu: no need to load MC firmware on fiji
  drm/amdgpu/gfx8: enable cp inst/reg error interrupts
  drm/amdgpu/gfx7: enable cp inst/reg error interrupts
  drm/amdgpu: move gmc7 support out of CIK dependency
  drm/amdgpu: pull topaz gmc bits into gmc_v7
  drm/amdgpu: drop topaz support from gmc8 module
  drm/amdgpu: don't load MEC2 on topaz
  drm/amdgpu: load MEC ucode manually on iceland
  drm/amdgpu: remove exp hardware support from iceland
  drm/amdgpu: disable uvd and vce clockgating on Fiji

Amitoj Kaur Chawla (1):
  drm/amdkfd: Remove unnecessary cast in kfree

Andreas Ziegler (1):
  drm/i915: Remove select to deleted STOP_MACHINE from Kconfig

Andrey Grodzovsky (1):
  drm/dp/mst: Reverse order of MST enable and clearing VC payload table.

Arnd Bergmann (2):
  drm/exynos: fix building without CONFIG_PM_SLEEP
  drm: exynos: make PM functions as __maybe_unused

Colin Ian King (1):
  drm/amdgpu: fix non-ANSI declaration of amdgpu_amdkfd_gfx_*_get_functions()

Dave Airlie (7):
  drm: add helper to check for wc memory support
  Merge tag 'drm-intel-fixes-2016-02-04' of git://anongit.freedesktop.org/drm-intel into drm-fixes
  Merge branch 'drm/adv7511' of git://git.kernel.org/.../wsa/linux into drm-fixes
  Merge branch 'exynos-drm-fixes' of git://git.kernel.org:/.../daeinki/drm-exynos into drm-fixes
  Merge tag 'drm-amdkfd-fixes-2016-01-28' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
  Merge branch 'drm-fixes-4.5' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge branch 'drm-fixes-mst' of git://people.freedesktop.org/~airlied/linux into drm-fixes

Francisco Jerez (1):
  drm/i915: Make sure DC writes are coherent on flush.

Gerd Hoffmann (1):
  drm/i915: refine qemu south bridge detection

Harry Wentland (2):
  drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
  drm/dp/mst: Calculate MST PBN with 31.32 fixed point

Hersen Wu (1):
  drm/dp/mst: move GUID storage from mgr, port to only mst branch

Imre Deak (2):
  drm/mst: Don't ignore the MST PBN self-test result
  drm/mst: Add range check for max_payloads during init

Insu Yun (1):
  drm: fix missing reference counting decrease

Jani Nikula (1):
  drm/i915/dp: fall back to 18 bpp when sink capability is unknown

Javier Martinez Canillas (1):
  drm/exynos: dp: Fix panel and bridge lookup logic

Ken Wang (2):
  drm/amdgpu: iceland use CI based MC IP
  drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above

Mykola Lysenko (2):
  drm/dp/mst: change MST detection scheme
  drm/dp/mst: deallocate payload on port destruction

Oded Gabbay (2):
  drm/radeon: mask out WC from BO on unsupported arches
  drm/amdgpu: mask out WC from BO on unsupported arches

Ville Syrjälä (2):
  drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
  drm/i915: Fix NULL plane->fb oops on SKL

Wolfram Sang (3):
  drm: adv7511: really enable interrupts for EDID detection
  drm: adv7511: mark ADV7511_REG_EDID_READ_CTRL volatile
  drm: adv7511: it's HPD, not HDP

 drivers/gpu/drm/amd/amdgpu/Makefile   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|   8 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c |  20 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |  28 ++-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c |  43 +++-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c |  30 +--
 drivers/gpu/drm/amd/amdgpu/iceland_smc.c  |  12 +-
 drivers/gpu/drm/amd/amdgpu/vi.c   |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |   2 +-
 drivers/gpu/drm/drm_dp_mst_topology.c | 279 ++
 drivers/gpu/drm/exynos/exynos_dp_core.c   |  55 ++---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c   |   6 +-

[PATCH] devm_memremap: Fix error value when memremap failed

2016-02-05 Thread Toshi Kani

devm_memremap() returns an ERR_PTR() value in case of error.
However, it returns NULL when memremap() failed.  This causes
the caller, such as the pmem driver, to proceed and oops later.

Change devm_memremap() to return ERR_PTR(-ENXIO) when memremap()
failed.

Signed-off-by: Toshi Kani 
Cc: Dan Williams 
Cc: Andrew Morton 
---
 kernel/memremap.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 70ee377..3427cca 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -136,8 +136,10 @@ void *devm_memremap(struct device *dev, resource_size_t 
offset,
if (addr) {
*ptr = addr;
devres_add(dev, ptr);
-   } else
+   } else {
devres_free(ptr);
+   return ERR_PTR(-ENXIO);
+   }

return addr;
 }
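
To illustrate why the NULL return described above slips past callers, here
is a user-space imitation of the kernel's error-pointer helpers. The
sketch_* names are hypothetical and only mimic the behaviour of ERR_PTR(),
IS_ERR() and PTR_ERR(); the real definitions live in the kernel headers and
may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses (assumed 4095). */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long error)
{
	/* Only negative errno values in [-SKETCH_MAX_ERRNO, -1] are valid. */
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline int sketch_is_err(const void *ptr)
{
	/* NULL is address 0, so it is never classified as an error. */
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* A caller written against the "error pointer on failure" contract. */
static inline int sketch_caller_accepts(const void *addr)
{
	return !sketch_is_err(addr);
}
```

A caller that only tests for an error pointer accepts NULL as a valid
mapping, since sketch_caller_accepts(NULL) is 1, and then dereferences it
later. Returning an error pointer for -ENXIO makes the same check fail as
intended.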

[PATCH] kbuild: disable Android-specific compiler features

2016-02-05 Thread Kees Cook

The Android compilers enable some non-standard features by default. While
most Android build systems inject the needed "-mno-android" option via
KCFLAGS, it happens too late (at least on x86_64), since KBUILD_CFLAGS
gains KCFLAGS after running (and failing) many cc-option tests. (For
example, the stack-protector tests happen after arch-specific
KBUILD_CFLAGS are added but before the external KCFLAGS are added.) As
such, we should notice this option and immediately turn it on as the
first cc-option test we run.

Signed-off-by: Kees Cook 
---
 Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index 6c1a3c247988..126c98b582bb 100644
--- a/Makefile
+++ b/Makefile
@@ -393,6 +393,10 @@ KBUILD_CFLAGS   := -Wall -Wundef -Wstrict-prototypes 
-Wno-trigraphs \
   -Wno-format-security \
   -std=gnu89
 
+# We must turn off the Android-specific compiler options as early as possible
+# otherwise cc-option calls below may erroneously fail.
+KBUILD_CFLAGS  += $(call cc-option,-mno-android,)
+
 KBUILD_AFLAGS_KERNEL :=
 KBUILD_CFLAGS_KERNEL :=
 KBUILD_AFLAGS   := -D__ASSEMBLY__
-- 
2.6.3


-- 
Kees Cook
Chrome OS & Brillo Security

Re: [PATCH v4] iio: adc: Add TI ADS1015 ADC driver support

2016-02-05 Thread Michael Welling

On Fri, Feb 05, 2016 at 09:32:34PM +0200, Daniel Baluta wrote:
> >> +static int ads1015_read_raw(struct iio_dev *indio_dev,
> >> + struct iio_chan_spec const *chan, int *val,
> >> + int *val2, long mask)
> >> +{
> >> + int ret, idx;
> >> + struct ads1015_data *data = iio_priv(indio_dev);
> >> +
> >> + mutex_lock(&data->lock);
> >> + switch (mask) {
> >> + case IIO_CHAN_INFO_RAW:
> >> + if (iio_buffer_enabled(indio_dev)) {
> >> + ret = -EBUSY;
> >> + break;
> >> + }
> >> +
> >> + ret = ads1015_set_power_state(data, true);
> >> + if (ret < 0)
> >> + break;
> >
> > Just tested the driver on a Dragonboard 410C with a robotics mezzanine that 
> > I
> > designed.
> >
> > The above ads1015_set_power_state(data, true) is always returning -EINVAL.
> >
> > Any ideas why that would be happening?
> > I think it may be the return from pm_runtime_get_sync?
> 
> Can you confirm that pm_runtime_get_sync fails? Using some printk?
> 
> Also adding printks in suspend/resume function would be helpful. Do
> you have CONFIG_PM enabled?
>

Indeed it is the pm_runtime_get_sync that fails with a -EINVAL.

> >
> > When I comment out the break the readings come back but are not updated 
> > continually.
> > If I read in_voltage0-voltage1_raw then in_voltage0_raw the value is 
> > updated.
> 
> I guess this is normal if set_power_state fails.

The hwmod driver works fine BTW.

My guess is there is an issue with the qup i2c driver, seeing as it has worked
on other systems without issue.

CC'd some of the latest developers on the qup i2c driver.

I2C guys have any ideas on this?

> 
> thanks,
> Daniel.

mmotm 2016-02-05-16-31 uploaded

2016-02-05 Thread akpm

The mm-of-the-moment snapshot 2016-02-05-16-31 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (4.x
or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

To develop on top of mmotm git:

  $ git remote add mmotm 
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  <make changes, commit>
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master <topic base> topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 4.5-rc2:
(patches marked "*" will be included in linux-next)

  origin.patch
* signals-work-around-random-wakeups-in-sigsuspend.patch
* block-fix-pfn_mkwrite-dax-fault-handler.patch
* m32r-fix-build-failure-due-to-smp-and-mmu.patch
* mm-validate_mm-browse_rb-smp-race-condition.patch
* dump_stack-avoid-potential-deadlocks.patch
* memblock-dont-mark-memblock_phys_mem_size-as-__init.patch
* mm-kconfig-correct-description-of-deferred_struct_page_init.patch
* mm-vmstat-make-quiet_vmstat-lighter.patch
* vmstat-make-vmstat_update-deferrable.patch
* mm-vmstat-fix-wrong-wq-sleep-when-memory-reclaim-doesnt-make-any-progress.patch
* mempolicy-do-not-try-to-queue-pages-from-vma_migratable.patch
* mm-downgrade-vm_bug-in-isolate_lru_page-to-warning.patch
* mm-hugetlb-fix-gigantic-page-initialization-allocation.patch
* mm-hugetlb-dont-require-cma-for-runtime-gigantic-pages.patch
* um-asm-pageh-remove-the-pte_high-member-from-struct-pte_t.patch
* ocfs2-dlm-clear-refmap-bit-of-recovery-lock-while-doing-local-recovery-cleanup.patch
* mm-replace-vma_lock_anon_vma-with-anon_vma_lock_read-write.patch
* thp-get-deferred_split_scan-work-again.patch
* dax-dirty-inode-only-if-required.patch
* maintainers-trim-the-file-triggers-for-abi-api.patch
* radix-tree-fix-oops-after-radix_tree_iter_retry.patch
* epoll-restrict-epollexclusive-to-pollin-and-pollout.patch
  i-need-old-gcc.patch
  arch-alpha-kernel-systblss-remove-debug-check.patch
  drivers-gpu-drm-i915-intel_spritec-fix-build.patch
  drivers-gpu-drm-i915-intel_tvc-fix-build.patch
  arm-mm-do-not-use-virt_to_idmap-for-nommu-systems.patch
* ipc-shm-handle-removed-segments-gracefully-in-shm_mmap.patch
* kernel-locking-lockdepc-convert-hash-tables-to-hlists.patch
* kernel-locking-lockdepc-convert-hash-tables-to-hlists-fix.patch
* mm-slab-free-kmem_cache_node-after-destroy-sysfs-file.patch
* arm-arch-arm-include-asm-pageh-needs-personalityh.patch
* m32r-mm-fix-build-warning.patch
* fs-ext4-fsyncc-generic_file_fsync-call-based-on-barrier-flag.patch
* ocfs2-cluster-replace-the-interrupt-safe-spinlocks-with-common-ones.patch
* ocfs2-use-spinlock-irqsave-for-downconvert-lock-in-ocfs2_osb_dump.patch
* ocfs2-dlm-fix-a-typo-in-dlmcommonh.patch
* ocfs2-dlm-add-deref_done-message.patch
* ocfs2-dlm-return-in-progress-if-master-can-not-clear-the-refmap-bit-right-now.patch
* ocfs2-dlm-clear-dropping_ref-flag-when-the-master-goes-down.patch
* ocfs2-dlm-return-einval-when-the-lockres-on-migration-target-is-in-dropping_ref-state.patch
* ocfs2-add-ocfs2_write_type_t-type-to-identify-the-caller-of-write.patch
*

[PATCH] ubsan: cosmetic fix to Kconfig text

2016-02-05 Thread Yang Shi

When UBSAN_SANITIZE_ALL is enabled, the kernel image size increases
significantly (~3x), so it is worth noting this in Kconfig.

Also fix a typo.

Signed-off-by: Yang Shi 
---
 lib/Kconfig.ubsan | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig.ubsan b/lib/Kconfig.ubsan
index 49518fb..e07c1ba 100644
--- a/lib/Kconfig.ubsan
+++ b/lib/Kconfig.ubsan
@@ -18,6 +18,8 @@ config UBSAN_SANITIZE_ALL
  This option activates instrumentation for the entire kernel.
  If you don't enable this option, you have to explicitly specify
  UBSAN_SANITIZE := y for the files/directories you want to check for 
UB.
+ Enabling this option will get kernel image size increased
+ significantly.
 
 config UBSAN_ALIGNMENT
bool "Enable checking of pointers alignment"
@@ -25,5 +27,5 @@ config UBSAN_ALIGNMENT
default y if !HAVE_EFFICIENT_UNALIGNED_ACCESS
help
  This option enables detection of unaligned memory accesses.
- Enabling this option on architectures that support unalligned
+ Enabling this option on architectures that support unaligned
  accesses may produce a lot of false positives.
-- 
2.0.2

[PATCHv4 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC

2016-02-05 Thread Laura Abbott



ARCH_SUPPORTS_DEBUG_PAGEALLOC provides a hook to map and unmap
pages for debugging purposes. This requires memory be mapped
with PAGE_SIZE mappings since breaking down larger mappings
at runtime will lead to TLB conflicts. Check if debug_pagealloc
is enabled at runtime and if so, map everything with PAGE_SIZE
pages. Implement the functions to actually map/unmap the
pages at runtime.

Reviewed-by: Ard Biesheuvel 
Reviewed-by: Mark Rutland 
Tested-by: Mark Rutland 
Signed-off-by: Laura Abbott 
---
 arch/arm64/Kconfig   |  3 +++
 arch/arm64/mm/mmu.c  | 19 +--
 arch/arm64/mm/pageattr.c | 46 --
 3 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc6228..0f33218 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -537,6 +537,9 @@ config HOTPLUG_CPU
 source kernel/Kconfig.preempt
 source kernel/Kconfig.hz
 
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+   def_bool y
+
 config ARCH_HAS_HOLES_MEMORYMODEL
def_bool y if SPARSEMEM
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ef0d66c..29dcc83 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -149,6 +149,19 @@ static void split_pud(pud_t *old_pud, pmd_t *pmd)
} while (pmd++, i++, i < PTRS_PER_PMD);
 }
 
+bool block_mappings_allowed(phys_addr_t (*pgtable_alloc)(void))
+{
+
+   /*
+* If debug_page_alloc is enabled we must map the linear map
+* using pages. However, other mappings created by
+* create_mapping_noalloc must use sections in some cases. Allow
+* sections to be used in those cases, where no pgtable_alloc
+* function is provided.
+*/
+   return !pgtable_alloc || !debug_pagealloc_enabled();
+}
+
 static void alloc_init_pmd(pud_t *pud, unsigned long addr, unsigned long end,
  phys_addr_t phys, pgprot_t prot,
  phys_addr_t (*pgtable_alloc)(void))
@@ -181,7 +194,8 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, 
unsigned long end,
do {
next = pmd_addr_end(addr, end);
/* try section mapping first */
-   if (((addr | next | phys) & ~SECTION_MASK) == 0) {
+   if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
+ block_mappings_allowed(pgtable_alloc)) {
pmd_t old_pmd =*pmd;
set_pmd(pmd, __pmd(phys |
   pgprot_val(mk_sect_prot(prot))));
@@ -241,7 +255,8 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, 
unsigned long end,
/*
 * For 4K granule only, attempt to put down a 1GB block
 */
-   if (use_1G_block(addr, next, phys)) {
+   if (use_1G_block(addr, next, phys) &&
+   block_mappings_allowed(pgtable_alloc)) {
pud_t old_pud = *pud;
set_pud(pud, __pud(phys |
   pgprot_val(mk_sect_prot(prot))));
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 1360a02..7ba87c4 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -37,14 +37,31 @@ static int change_page_range(pte_t *ptep, pgtable_t token, 
unsigned long addr,
return 0;
 }
 
+/*
+ * This function assumes that the range is mapped with PAGE_SIZE pages.
+ */
+static int __change_memory_common(unsigned long start, unsigned long size,
+   pgprot_t set_mask, pgprot_t clear_mask)
+{
+   struct page_change_data data;
+   int ret;
+
+   data.set_mask = set_mask;
+   data.clear_mask = clear_mask;
+
+   ret = apply_to_page_range(&init_mm, start, size, change_page_range,
+   &data);
+
+   flush_tlb_kernel_range(start, start + size);
+   return ret;
+}
+
 static int change_memory_common(unsigned long addr, int numpages,
pgprot_t set_mask, pgprot_t clear_mask)
 {
unsigned long start = addr;
unsigned long size = PAGE_SIZE*numpages;
unsigned long end = start + size;
-   int ret;
-   struct page_change_data data;
struct vm_struct *area;
 
if (!PAGE_ALIGNED(addr)) {
@@ -72,14 +89,7 @@ static int change_memory_common(unsigned long addr, int 
numpages,
!(area->flags & VM_ALLOC))
return -EINVAL;
 
-   data.set_mask = set_mask;
-   data.clear_mask = clear_mask;
-
-   ret = apply_to_page_range(&init_mm, start, size, change_page_range,
-   &data);
-
-   flush_tlb_kernel_range(start, end);
-   return ret;
+   return __change_memory_common(start, size, set_mask, clear_mask);
 }
 
 int set_memory_ro(unsigned long addr, int numpages)
@@ -111,3 +121,19 @@ int set_memory_x(unsigned long addr, int numpages)
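
The decision made by block_mappings_allowed() in the patch above reduces to
a small truth table: block (section) mappings are refused only when an
allocator callback is supplied and debug_pagealloc is enabled. The
user-space sketch below models just that predicate. The sketch_* names are
hypothetical, and the boolean flag stands in for debug_pagealloc_enabled().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long sketch_phys_addr_t;
typedef sketch_phys_addr_t (*sketch_pgtable_alloc_t)(void);

/* Stand-in for debug_pagealloc_enabled(). */
static bool sketch_debug_pagealloc;

/* Dummy allocator; only its non-NULL address matters here. */
static inline sketch_phys_addr_t sketch_alloc(void)
{
	return 0x1000;
}

/* Same shape as the predicate in the patch. */
static inline bool sketch_block_mappings_allowed(sketch_pgtable_alloc_t alloc)
{
	return !alloc || !sketch_debug_pagealloc;
}

/* Walk the four combinations of (allocator present, debug enabled). */
static inline void sketch_self_check(void)
{
	sketch_debug_pagealloc = false;
	assert(sketch_block_mappings_allowed(NULL));
	assert(sketch_block_mappings_allowed(sketch_alloc));

	sketch_debug_pagealloc = true;
	assert(sketch_block_mappings_allowed(NULL));
	assert(!sketch_block_mappings_allowed(sketch_alloc));
}
```

The NULL-allocator case stays permissive, which matches the comment in the
patch about mappings created without a pgtable_alloc function.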

[PATCHv4 3/3] arm64: ptdump: Indicate whether memory should be faulting

2016-02-05 Thread Laura Abbott



With CONFIG_DEBUG_PAGEALLOC, pages do not have the valid bit
set when free in the buddy allocator. Add an indication to
the page table dumping code that the valid bit is not set,
'F' for fault, to make this easier to understand.

Reviewed-by: Ard Biesheuvel 
Reviewed-by: Mark Rutland 
Tested-by: Mark Rutland 
Signed-off-by: Laura Abbott 
---
 arch/arm64/mm/dump.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 5a22a11..f381ac9 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -90,6 +90,11 @@ struct prot_bits {
 
 static const struct prot_bits pte_bits[] = {
{
+   .mask   = PTE_VALID,
+   .val= PTE_VALID,
+   .set= " ",
+   .clear  = "F",
+   }, {
.mask   = PTE_USER,
.val= PTE_USER,
.set= "USR",
-- 
2.5.0
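
For context on how a table like pte_bits[] is consumed: each entry carries
a mask, the value expected under that mask, and one label for the "set" and
"clear" cases. The user-space sketch below decodes a value against such a
table. The sketch_* names are hypothetical, the bit positions are chosen
for illustration only, and the real dumper's output format may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_prot_bits {
	uint64_t mask;
	uint64_t val;
	const char *set;	/* printed when (pte & mask) == val */
	const char *clear;	/* printed otherwise */
};

/* Illustrative bit positions, not taken from the patch. */
#define SKETCH_PTE_VALID (1ULL << 0)
#define SKETCH_PTE_USER  (1ULL << 6)

static const struct sketch_prot_bits sketch_pte_bits[] = {
	{ SKETCH_PTE_VALID, SKETCH_PTE_VALID, " ",   "F"   },
	{ SKETCH_PTE_USER,  SKETCH_PTE_USER,  "USR", "   " },
};

/* Concatenate the label of every table entry into out (NUL-terminated). */
static inline void sketch_decode(uint64_t pte, char *out, size_t len)
{
	size_t n = sizeof(sketch_pte_bits) / sizeof(sketch_pte_bits[0]);
	size_t i;

	assert(out != NULL && len > 0);
	out[0] = '\0';
	for (i = 0; i < n; i++) {
		const struct sketch_prot_bits *b = &sketch_pte_bits[i];
		const char *s = ((pte & b->mask) == b->val) ? b->set : b->clear;

		/* Stop rather than overflow the caller's buffer. */
		if (strlen(out) + strlen(s) + 1 > len)
			break;
		strcat(out, s);
	}
}
```

An entry with the valid bit clear decodes with a leading "F", which is the
marker the patch adds for pages that would fault.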

[PATCHv4 1/3] arm64: Drop alloc function from create_mapping

2016-02-05 Thread Laura Abbott



create_mapping is only used in fixmap_remap_fdt. All the create_mapping
calls need to happen on existing translation table pages without
additional allocations. Rather than have an alloc function be called
and fail, just set it to NULL and catch its use. Also change
the name to create_mapping_noalloc to better capture what exactly is
going on.

Reviewed-by: Ard Biesheuvel 
Reviewed-by: Mark Rutland 
Signed-off-by: Laura Abbott 
---
 arch/arm64/mm/mmu.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 7711554..ef0d66c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -116,7 +116,9 @@ static void alloc_init_pte(pmd_t *pmd, unsigned long addr,
pte_t *pte;
 
if (pmd_none(*pmd) || pmd_sect(*pmd)) {
-   phys_addr_t pte_phys = pgtable_alloc();
+   phys_addr_t pte_phys;
+   BUG_ON(!pgtable_alloc);
+   pte_phys = pgtable_alloc();
pte = pte_set_fixmap(pte_phys);
if (pmd_sect(*pmd))
split_pmd(pmd, pte);
@@ -158,7 +160,9 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, 
unsigned long end,
 * Check for initial section mappings in the pgd/pud and remove them.
 */
if (pud_none(*pud) || pud_sect(*pud)) {
-   phys_addr_t pmd_phys = pgtable_alloc();
+   phys_addr_t pmd_phys;
+   BUG_ON(!pgtable_alloc);
+   pmd_phys = pgtable_alloc();
pmd = pmd_set_fixmap(pmd_phys);
if (pud_sect(*pud)) {
/*
@@ -223,7 +227,9 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, 
unsigned long end,
unsigned long next;
 
if (pgd_none(*pgd)) {
-   phys_addr_t pud_phys = pgtable_alloc();
+   phys_addr_t pud_phys;
+   BUG_ON(!pgtable_alloc);
+   pud_phys = pgtable_alloc();
__pgd_populate(pgd, pud_phys, PUD_TYPE_TABLE);
}
BUG_ON(pgd_bad(*pgd));
@@ -312,7 +318,12 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t 
phys,
init_pgd(pgd_offset_raw(pgdir, virt), phys, virt, size, prot, alloc);
 }
 
-static void __init create_mapping(phys_addr_t phys, unsigned long virt,
+/*
+ * This function can only be used to modify existing table entries,
+ * without allocating new levels of table. Note that this permits the
+ * creation of new section or page entries.
+ */
+static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
  phys_addr_t size, pgprot_t prot)
 {
if (virt < VMALLOC_START) {
@@ -321,7 +332,7 @@ static void __init create_mapping(phys_addr_t phys, 
unsigned long virt,
return;
}
__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
-early_pgtable_alloc);
+NULL);
 }
 
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
@@ -680,7 +691,7 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
/*
 * Make sure that the FDT region can be mapped without the need to
 * allocate additional translation table pages, so that it is safe
-* to call create_mapping() this early.
+* to call create_mapping_noalloc() this early.
 *
 * On 64k pages, the FDT will be mapped using PTEs, so we need to
 * be in the same PMD as the rest of the fixmap.
@@ -696,8 +707,8 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
dt_virt = (void *)dt_virt_base + offset;
 
/* map the first chunk so we can read the size from the header */
-   create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), dt_virt_base,
-  SWAPPER_BLOCK_SIZE, prot);
+   create_mapping_noalloc(round_down(dt_phys, SWAPPER_BLOCK_SIZE),
+   dt_virt_base, SWAPPER_BLOCK_SIZE, prot);
 
if (fdt_check_header(dt_virt) != 0)
return NULL;
@@ -707,7 +718,7 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
return NULL;
 
if (offset + size > SWAPPER_BLOCK_SIZE)
-   create_mapping(round_down(dt_phys, SWAPPER_BLOCK_SIZE), 
dt_virt_base,
+   create_mapping_noalloc(round_down(dt_phys, SWAPPER_BLOCK_SIZE), 
dt_virt_base,
   round_up(offset + size, SWAPPER_BLOCK_SIZE), 
prot);
 
memblock_reserve(dt_phys, size);
-- 
2.5.0

[PATCHv4 0/3] ARCH_SUPPORTS_DEBUG_PAGEALLOC for arm64

2016-02-05 Thread Laura Abbott

Hi,

This is hopefully the last update to add proper ARCH_SUPPORTS_DEBUG_PAGEALLOC
support for arm64.

Changes since v3:
- More acks
- Pulled out block check into a separate function
- Style fixups
- Comment tweaking

Laura Abbott (3):
  arm64: Drop alloc function from create_mapping
  arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC
  arm64: ptdump: Indicate whether memory should be faulting

 arch/arm64/Kconfig   |  3 +++
 arch/arm64/mm/dump.c |  5 +
 arch/arm64/mm/mmu.c  | 48 +---
 arch/arm64/mm/pageattr.c | 46 --
 4 files changed, 81 insertions(+), 21 deletions(-)

-- 
2.5.0

[PATCH] arm64: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL

2016-02-05 Thread Yang Shi

To enable UBSAN on arm64, ARCH_HAS_UBSAN_SANITIZE_ALL needs to be selected.

A basic kernel boot test passes on arm64 with CONFIG_UBSAN_SANITIZE_ALL
enabled.

Signed-off-by: Yang Shi 
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc6228..1c29e20 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -14,6 +14,7 @@ config ARM64
select ARCH_WANT_OPTIONAL_GPIOLIB
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select ARCH_WANT_FRAME_POINTERS
+   select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
-- 
2.0.2

[ANNOUNCE] Git v2.7.1

2016-02-05 Thread Junio C Hamano

The latest maintenance release Git v2.7.1 is now available at
the usual places.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/

The following public repositories all have a copy of the 'v2.7.1'
tag and the 'maint' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git



Git v2.7.1 Release Notes


Fixes since v2.7


 * An earlier change in 2.5.x-era broke users' hooks and aliases by
   exporting GIT_WORK_TREE to point at the root of the working tree,
   interfering when they tried to use a different working tree without
   setting GIT_WORK_TREE environment themselves.

 * The "exclude_list" structure has the usual "alloc, nr" pair of
   fields to be used by ALLOC_GROW(), but clear_exclude_list() forgot
   to reset 'alloc' to 0 when it cleared 'nr' to discard the managed
   array.

 * "git send-email" was confused by escaped quotes stored in the alias
   files saved by "mutt", which has been corrected.

 * A few unportable C constructs have been spotted by the clang compiler
   and have been fixed.

 * The documentation has been updated to hint the connection between
   the '--signoff' option and DCO.

 * "git reflog" incorrectly assumed that all objects that used to be
   at the tip of a ref must be commits, which caused it to segfault.

 * The ignore mechanism saw a few regressions around untracked file
   listing and sparse checkout selection areas in 2.7.0; the change
   that is responsible for the regression has been reverted.

 * Some codepaths used fopen(3) when opening a fixed path in $GIT_DIR
   (e.g. COMMIT_EDITMSG) that is meant to be left after the command is
   done.  This however did not work well if the repository is set to
   be shared with core.sharedRepository and the umask of the previous
   user is tighter.  They have been made to work better by calling
   unlink(2) and retrying after fopen(3) fails with EPERM.

 * Asking gitweb for a nonexistent commit left a warning in the server
   log.

 * "git rebase", unlike all other callers of "gc --auto", did not
   ignore the exit code from "gc --auto".

 * Many codepaths that run "gc --auto" before exiting kept packfiles
   mapped and left the file descriptors to them open, which was not
   friendly to systems that cannot remove files that are open.  They
   now close the packs before doing so.

 * A recent optimization to filter-branch in v2.7.0 introduced a
   regression when --prune-empty filter is used, which has been
   corrected.

 * The description for SANITY prerequisite the test suite uses has
   been clarified both in the comment and in the implementation.

 * "git tag" started listing a tag "foo" as "tags/foo" when a branch
   named "foo" exists in the same repository; remove this unnecessary
   disambiguation, which is a regression introduced in v2.7.0.

 * The way "git svn" uses auth parameter was broken by Subversion
   1.9.0 and later.

 * The "split" subcommand of "git subtree" (in contrib/) incorrectly
   skipped merges when it shouldn't, which was corrected.

 * A few options of "git diff" did not work well when the command was
   run from a subdirectory.

 * dirname() emulation has been added, as Msys2 lacks it.

 * The underlying machinery used by "ls-files -o" and other commands
   has been taught not to create an empty submodule ref cache for a
   directory that is not a submodule.  This removes a ton of wasted
   CPU cycles.

 * Drop a few old "todo" items by deciding that the change one of them
   suggests is not such a good idea, and doing the change the other
   one suggested to do.

 * Documentation for "git fetch --depth" has been updated for clarity.

 * The command line completion learned a handful of additional options
   and command specific syntax.
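
The "alloc, nr" fix for clear_exclude_list() listed above can be illustrated
with a minimal sketch. This is not Git's code: the struct and function names
(int_list, int_list_push, int_list_clear) are hypothetical, and only the
growth formula follows the usual "(alloc + 16) * 3 / 2" convention. The point
is that a clear routine which frees the array must reset both counters, or a
later push would trust a stale capacity and write through a freed pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array using the "alloc, nr" convention. */
struct int_list {
	int *items;
	size_t nr;	/* number of entries in use */
	size_t alloc;	/* number of entries allocated */
};

/* Append one value, growing the array when nr would exceed alloc. */
int int_list_push(struct int_list *l, int v)
{
	assert(l != NULL);
	if (l->nr + 1 > l->alloc) {
		size_t new_alloc = (l->alloc + 16) * 3 / 2;
		int *p = realloc(l->items, new_alloc * sizeof(*p));

		if (!p)
			return -1;
		l->items = p;
		l->alloc = new_alloc;
	}
	l->items[l->nr++] = v;
	return 0;
}

/* Discard the array; both nr and alloc must go back to 0. */
void int_list_clear(struct int_list *l)
{
	assert(l != NULL);
	free(l->items);
	l->items = NULL;
	l->nr = 0;
	l->alloc = 0;	/* forgetting this line is the class of bug described */
}
```

With 'alloc' reset, a push after a clear allocates a fresh array instead of
reusing the freed one.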

Also includes a handful of documentation and test updates.
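
As a rough illustration of the fopen(3) item in the fixes above, the sketch
below retries after unlink(2) when the first open fails with EPERM. The
helper name fopen_for_rewrite is hypothetical and this is a sketch of the
described behaviour only, not a copy of Git's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open a fixed path for writing.  If the file was left behind by another
 * user of a shared repository and cannot be reopened (EPERM), remove it
 * and try once more.  Hypothetical helper, for illustration only.
 */
FILE *fopen_for_rewrite(const char *path)
{
	FILE *fp;

	assert(path != NULL);
	fp = fopen(path, "w");
	if (!fp && errno == EPERM) {
		if (unlink(path) == 0)
			fp = fopen(path, "w");
		else
			errno = EPERM;	/* report the original failure */
	}
	return fp;
}
```

The retry only helps when the caller may remove the file from the directory,
which is the shared-repository case the release note describes.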



Changes since v2.7.0 are as follows:

Changwoo Ryu (1):
  l10n: ko.po: Add Korean translation

Dave Ware (1):
  contrib/subtree: fix "subtree split" skipped-merge bug

David A. Wheeler (1):
  Expand documentation describing --signoff

Dennis Kaarsemaker (1):
  reflog-walk: don't segfault on non-commit sha1's in the reflog

Eric Wong (3):
  git-send-email: do not double-escape quotes from mutt
  for-each-ref: document `creatordate` and `creator` fields
  git-svn: fix auth parameter handling on SVN 1.9.0+

Jeff King (8):
  avoid shifting signed integers 31 bits
  bswap: add NO_UNALIGNED_LOADS define
  rebase: ignore failures from "gc --auto"
  filter-branch: resolve $commit^{tree} in no-index case
  clean: make is_git_repository a public

Re: [PATCHv2 2/3] arm64: Add support for ARCH_SUPPORTS_DEBUG_PAGEALLOC

2016-02-05 Thread Laura Abbott


On 02/05/2016 06:20 AM, Mark Rutland wrote:

On Thu, Feb 04, 2016 at 11:43:36AM -0800, Laura Abbott wrote:


ARCH_SUPPORTS_DEBUG_PAGEALLOC provides a hook to map and unmap
pages for debugging purposes. This requires memory be mapped
with PAGE_SIZE mappings since breaking down larger mappings
at runtime will lead to TLB conflicts. Check if debug_pagealloc
is enabled at runtime and if so, map everything with PAGE_SIZE
pages. Implement the functions to actually map/unmap the
pages at runtime.

Signed-off-by: Laura Abbott 


I've given this a spin on Juno, with and without the config option
selected, and with and without the command line option. I've also given
it a spin on Seattle with inline KASAN also enabled.

I wasn't sure how to deliberately trigger a failure, but those all
booted fine, and the dumped page tables look right, so FWIW:



I wrote a test that does a write after free on an allocated page
for my testing. Might be worth it to look into adding a test to
the lkdtm module.

I also did testing with 64K pages but couldn't test on 16K due
to lack of hardware and QEMU running off into the weeds.
 

Tested-by: Mark Rutland 

I have a few minor comments below, and with those fixed up:

Reviewed-by: Mark Rutland 


---
  arch/arm64/Kconfig   |  3 +++
  arch/arm64/mm/mmu.c  | 25 +
  arch/arm64/mm/pageattr.c | 46 --
  3 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc6228..0f33218 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -537,6 +537,9 @@ config HOTPLUG_CPU
  source kernel/Kconfig.preempt
  source kernel/Kconfig.hz

+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+   def_bool y
+
  config ARCH_HAS_HOLES_MEMORYMODEL
def_bool y if SPARSEMEM

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ef0d66c..be81a59 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -180,8 +180,14 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, 
unsigned long end,
pmd = pmd_set_fixmap_offset(pud, addr);
do {
next = pmd_addr_end(addr, end);
-   /* try section mapping first */
-   if (((addr | next | phys) & ~SECTION_MASK) == 0) {
+   /*
+* try a section mapping first
+*
+* See comment in use_1G_block for why we need the check
+* for !pgtable_alloc with !debug_pagealloc
+*/
+   if (((addr | next | phys) & ~SECTION_MASK) == 0 &&
+ (!debug_pagealloc_enabled() || !pgtable_alloc)) {
pmd_t old_pmd =*pmd;
set_pmd(pmd, __pmd(phys |
   pgprot_val(mk_sect_prot(prot))));
@@ -208,8 +214,19 @@ static void alloc_init_pmd(pud_t *pud, unsigned long addr, 
unsigned long end,
  }

  static inline bool use_1G_block(unsigned long addr, unsigned long next,
-   unsigned long phys)
+   unsigned long phys, phys_addr_t (*pgtable_alloc)(void))
  {
+   /*
+* If debug_page_alloc is enabled we don't want to be using sections
+* since everything needs to be mapped with pages. The catch is
+* that we only want to force pages if we can allocate the next
+* layer of page tables. If there is no pgtable_alloc function,
+* it's too early to allocate another layer and we should use
+* section mappings.
+*/


I'm not sure this quite captures the rationale, as we only care about
the linear map using pages (AFAIK), and the earliness only matters
w.r.t. the DTB mapping. How about:

/*
 * If debug_page_alloc is enabled we must map the linear map
 * using pages. However, other mappings created by
 * create_mapping_noalloc must use sections in some cases. Allow
 * sections to be used in those cases, where no pgtable_alloc
 * function is provided.
 */

Does that sound ok to you?

As a future optimisation, I think we can allow sections when mapping
permanent kernel chunks (e.g. .rodata and .text), as these shouldn't
contain pages available for dynamic allocation. That would require using
something other than the presence of pgtable_alloc to determine when we
should force page usage.
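
The rule under discussion (force page mappings only when debug_pagealloc is
enabled and an allocator is available) can be sketched in isolation. This is
a simplified user-space model, not the kernel code: may_use_block,
pgtable_alloc_fn and dummy_pgtable_alloc are hypothetical names, and the real
use_1G_block() also checks the page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pgtable_alloc callback type. */
typedef uint64_t (*pgtable_alloc_fn)(void);

/* Dummy allocator, used only to model the "allocator present" case. */
uint64_t dummy_pgtable_alloc(void)
{
	return 0;
}

/*
 * Decide whether [addr, next) -> phys may be mapped with one block
 * (section) entry of block_size bytes.
 */
bool may_use_block(uint64_t addr, uint64_t next, uint64_t phys,
		   uint64_t block_size, pgtable_alloc_fn alloc,
		   bool debug_pagealloc)
{
	/* block_size must be a non-zero power of two */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	/* With an allocator and debug_pagealloc, always use pages. */
	if (alloc != NULL && debug_pagealloc)
		return false;

	/* Otherwise a block is fine when everything is block-aligned. */
	return ((addr | next | phys) & (block_size - 1)) == 0;
}
```

In this model the no-allocator path (as in create_mapping_noalloc) keeps
using blocks even with debug_pagealloc enabled, which matches the rationale
given for the early DTB mapping.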


+   if (pgtable_alloc && debug_pagealloc_enabled())
+   return false;
+
if (PAGE_SHIFT != 12)
return false;

@@ -241,7 +258,7 @@ static void alloc_init_pud(pgd_t *pgd, unsigned long addr, 
unsigned long end,
/*
 * For 4K granule only, attempt to put down a 1GB block
 */
-   if (use_1G_block(addr, next, phys)) {
+   if (use_1G_block(addr, next, phys, pgtable_alloc)) {
pud_t old_pud = *pud;
set_pud(pud, __pud(phys |

[PATCH] i2c: i801: Intel DNV_N device IDs SMBus

2016-02-05 Thread Alexandra Yates

Adding Intel codename DNV_N platform device IDs for SMBus.

Signed-off-by: Alexandra Yates 
---
 drivers/i2c/busses/i2c-i801.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index f62d697..fb1a3ca 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -60,6 +60,7 @@
  * BayTrail (SOC)              0x0f12  32  hard  yes  yes  yes
  * Sunrise Point-H (PCH)       0xa123  32  hard  yes  yes  yes
  * Sunrise Point-LP (PCH)      0x9d23  32  hard  yes  yes  yes
+ * DNV_N (SOC)                 0x19ac  32  hard  yes  yes  yes
  * DNV (SOC)                   0x19df  32  hard  yes  yes  yes
  * Broxton (SOC)               0x5ad4  32  hard  yes  yes  yes
  * Lewisburg (PCH)             0xa1a3  32  hard  yes  yes  yes
@@ -206,6 +207,7 @@
 #define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_SMBUS  0x9ca2
 #define PCI_DEVICE_ID_INTEL_SUNRISEPOINT_H_SMBUS   0xa123
 #define PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_SMBUS  0x9d23
+#define PCI_DEVICE_ID_INTEL_DNV_N_SMBUS            0x19ac
 #define PCI_DEVICE_ID_INTEL_DNV_SMBUS              0x19df
 #define PCI_DEVICE_ID_INTEL_BROXTON_SMBUS          0x5ad4
 #define PCI_DEVICE_ID_INTEL_LEWISBURG_SMBUS        0xa1a3
@@ -871,6 +873,7 @@ static const struct pci_device_id i801_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_BRASWELL_SMBUS) },
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_SUNRISEPOINT_H_SMBUS) },
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 
PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_SMBUS) },
+   { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_DNV_N_SMBUS) },
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_DNV_SMBUS) },
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_BROXTON_SMBUS) },
{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LEWISBURG_SMBUS) 
},
@@ -1271,6 +1274,7 @@ static int i801_probe(struct pci_dev *dev, const struct 
pci_device_id *id)
switch (dev->device) {
case PCI_DEVICE_ID_INTEL_SUNRISEPOINT_H_SMBUS:
case PCI_DEVICE_ID_INTEL_SUNRISEPOINT_LP_SMBUS:
+   case PCI_DEVICE_ID_INTEL_DNV_N_SMBUS:
case PCI_DEVICE_ID_INTEL_DNV_SMBUS:
priv->features |= FEATURE_I2C_BLOCK_READ;
priv->features |= FEATURE_IRQ;
-- 
1.9.1

Re: [PATCH v10 5/5] Watchdog: ARM SBSA Generic Watchdog half timeout panic support

2016-02-05 Thread Guenter Roeck


On 02/05/2016 10:21 AM, Fu Wei wrote:

On 5 February 2016 at 22:42, Guenter Roeck  wrote:

On 02/05/2016 01:51 AM, Fu Wei wrote:


Hi Guenter,

On 4 February 2016 at 13:17, Guenter Roeck  wrote:


On 02/03/2016 03:00 PM, Fu Wei wrote:



On 4 February 2016 at 02:45, Timur Tabi  wrote:



Fu Wei wrote:




As you know, I have written the pre-timeout support patch. If people like
it, I am happy to go on and upstream it separately.

If we want to use pre-timeout here, the user can only use get_pretimeout
and disable the panic by setting pretimeout to 0.
The user cannot really set pretimeout, because "pre-timeout ==
timeout / 2 (always)".
If the user wants to change pretimeout, he/she has to use set_time instead.





Ok, I think patches 4 and 5 should be combined, and I think the Kconfig
entry should be removed and just use panic_enabled.




Agreed.



np, will do




NP, will update this patchset like that ,  thanks :-)



Also, if panic is enabled, the timeout needs to be adjusted accordingly
(to only panic after the entire timeout period has expired, not after
half of it). We can not panic the system after timeout / 2.



OK, my thought is

if panic is enabled :
|WOR---WS0WOR---WS1
|--timeout--(panic)--timeout-reset

if panic is disabled .
|WOR---WS0WOR---WS1
|-timeout-reset

   panic_enabled can only be configured through a module parameter when
the module is loaded.

But the user should know that max_timeout(panic_enable) =
max_timeout(panic_disable) / 2



That means you'll have to update max_timeout accordingly.


panic_enabled can only be configured when the module is loaded, so we
don't need to update it.

max_timeout will only be set up in the init stage.

Does that make sense? :-)


Not sure I understand your problem or question.

max_timeout will have to reflect the correct maximum timeout, under
all circumstances. It will have to be set to the correct value before
the watchdog driver is registered.

Guenter
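
The max_timeout relationship discussed in this thread can be written down as
a small sketch. It assumes, as the diagrams above suggest, a 32-bit WOR
register counting at clk_hz, with the reset arriving after one WOR period
when panic is enabled and after two when it is disabled. The function name
sbsa_max_timeout_secs is hypothetical and this is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Largest timeout (in seconds) that can be offered to user space.
 * Assumption: one WOR period is at most UINT32_MAX / clk_hz seconds.
 */
uint64_t sbsa_max_timeout_secs(uint32_t clk_hz, bool panic_enabled)
{
	uint64_t one_stage;

	assert(clk_hz != 0);
	one_stage = UINT32_MAX / clk_hz;

	/*
	 * Panic enabled:  WS0 (panic) must fire after the full timeout,
	 *                 so the timeout is limited to one WOR period.
	 * Panic disabled: the reset at WS1 comes after two WOR periods,
	 *                 so the timeout may be twice as long.
	 */
	return panic_enabled ? one_stage : one_stage * 2;
}
```

Because panic_enabled is fixed at module load, a value like this only needs
to be computed once, before the watchdog device is registered.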

[GIT PULL] Power management and ACPI fixes for v4.5-rc3

2016-02-05 Thread Rafael J. Wysocki

Hi Linus,

Please pull from

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pm+acpi-4.5-rc3

to receive power management and ACPI fixes for v4.5-rc3 with top-most
commit 79e2f8dd522873614eb31001745af487451a10de

 Merge branches 'pm-core' and 'pm-domains'

on top of commit 36f90b0a2ddd60823fe193a85e60ff1906c2a9b3

 Linux 4.5-rc2

These are: a fix for recently introduced false-positive
warnings about PM domain pointers being changed inappropriately
(harmless but annoying), an MCH size workaround quirk for one
more platform, a compiler warning fix (generic power domains
framework), an ACPI LPSS (Intel SoCs) driver fixup and a cleanup
of the ACPI CPPC core code.

Specifics:

 - PM core fix to avoid false-positive warnings generated when
   the pm_domain field is cleared for a device that appears to
   be bound to a driver (Rafael Wysocki).

 - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).

 - Fix for an "unused function" compiler warning in the generic
   power domains framework (Ulf Hansson).

 - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set
   the PM domain pointer of a device properly in one place that
   was overlooked by a recent PM core update (Andy Shevchenko).

 - Removal of a redundant function declaration in the ACPI CPPC
   core code (Timur Tabi).

Thanks!

---

Andy Shevchenko (1):
  ACPI / LPSS: set PM domain via helper setter

Josh Boyer (1):
  PNP: Add Haswell-ULT to Intel MCH size workaround

Rafael J. Wysocki (1):
  PM: Avoid false-positive warnings in dev_pm_domain_set()

Timur Tabi (1):
  ACPI / CPPC: remove redundant mbox_send_message() declaration

Ulf Hansson (1):
  PM / Domains: Silence compiler warning for an unused function

---

 drivers/acpi/acpi_lpss.c|  2 +-
 drivers/base/power/common.c |  2 +-
 drivers/base/power/domain.c | 27 +++
 drivers/pnp/quirks.c|  1 +
 include/acpi/cppc_acpi.h|  1 -
 5 files changed, 10 insertions(+), 23 deletions(-)

Re: Bisected Regression 4.3.5 => 4.4.1 booting HP ZBook in EFI mode

2016-02-05 Thread Phil Turmel

On 02/05/2016 05:29 PM, Greg Kroah-Hartman wrote:
> On Fri, Feb 05, 2016 at 04:48:52PM -0500, Phil Turmel wrote:

>> I'm stumped as to how that powerpc patch can affect my x86 laptop, an
>> HP ZBook 17 w/ i7 processor & nouveau graphics, but it certainly
>> does.  The bisect was stable and I confirmed by reverting it on
>> top of the intended v4.4.1.
> 
> That's crazy, nothing should even be rebuilt if you revert that patch,
> so I don't see how that could affect things here.

I thought so too, but ...

> Can you verify that nothing does get rebuilt when you do this?

# git checkout v4.4.1

# make -j15
// lots of output ///

# make
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CHK include/generated/bounds.h
  CHK include/generated/timeconst.h
  CHK include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CHK kernel/config_data.h
Kernel: arch/x86/boot/bzImage is ready  (#133)
  Building modules, stage 2.
  MODPOST 1166 modules

# git revert badc688
// trimmed commit log ///
 1 file changed, 8 insertions(+), 8 deletions(-)

# make
  CHK include/config/kernel.release
  UPD include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  UPD include/generated/utsrelease.h
  CHK include/generated/bounds.h
  CHK include/generated/timeconst.h
  CHK include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CC  init/version.o
  LD  init/built-in.o
  CC  kernel/sys.o
  CC  kernel/trace/trace.o
  LD  kernel/trace/built-in.o
  CC  kernel/module.o
  CHK kernel/config_data.h
  LD  kernel/built-in.o
  CC  drivers/base/firmware_class.o
  LD  drivers/base/built-in.o
  CC  drivers/gpu/drm/i915/i915_gpu_error.o
  LD  drivers/gpu/drm/i915/i915.o
  LD  drivers/gpu/drm/i915/built-in.o
  LD  drivers/gpu/drm/built-in.o
  LD  drivers/gpu/built-in.o
  CC  drivers/target/target_core_configfs.o
  LD  drivers/target/target_core_mod.o
  LD  drivers/target/built-in.o
  CC [M]  drivers/vhost/scsi.o
  LD [M]  drivers/vhost/vhost_scsi.o
  LD  drivers/built-in.o
  LINK    vmlinux
  LD  vmlinux.o
  MODPOST vmlinux.o
  GEN .version
  CHK include/generated/compile.h
  UPD include/generated/compile.h
  CC  init/version.o
  LD  init/built-in.o
  KSYM    .tmp_kallsyms1.o
  KSYM    .tmp_kallsyms2.o
  LD  vmlinux
  SORTEX  vmlinux
  SYSMAP  System.map
  VOFFSET arch/x86/boot/voffset.h
  OBJCOPY arch/x86/boot/compressed/vmlinux.bin
  LZMA    arch/x86/boot/compressed/vmlinux.bin.lzma
  MKPIGGY arch/x86/boot/compressed/piggy.S
  AS  arch/x86/boot/compressed/piggy.o
  LD  arch/x86/boot/compressed/vmlinux
  ZOFFSET arch/x86/boot/zoffset.h
  AS  arch/x86/boot/header.o
  CC  arch/x86/boot/version.o
  LD  arch/x86/boot/setup.elf
  OBJCOPY arch/x86/boot/setup.bin
  OBJCOPY arch/x86/boot/vmlinux.bin
  BUILD   arch/x86/boot/bzImage
Setup is 15564 bytes (padded to 15872 bytes).
System is 26558 kB
CRC 29b78b83
Kernel: arch/x86/boot/bzImage is ready  (#134)
  Building modules, stage 2.
  MODPOST 1166 modules

//
So, a handful of items in the main kernel get
rebuilt, including i915 stuff.  This laptop does
have the intel graphics base with nvidia layered
on top.

The build proceeded to redo all my modules, shown below.

Phil

//

  CC  drivers/bcma/bcma.mod.o
  LD [M]  drivers/bcma/bcma.ko
  CC  drivers/block/drbd/drbd.mod.o
  LD [M]  drivers/block/drbd/drbd.ko
  CC  drivers/block/floppy.mod.o
  LD [M]  drivers/block/floppy.ko
  CC  drivers/block/mtip32xx/mtip32xx.mod.o
  LD [M]  drivers/block/mtip32xx/mtip32xx.ko
  CC  drivers/block/nbd.mod.o
  LD [M]  drivers/block/nbd.ko
  CC  drivers/block/rbd.mod.o
  LD [M]  drivers/block/rbd.ko
  CC  drivers/block/rsxx/rsxx.mod.o
  LD [M]  drivers/block/rsxx/rsxx.ko
  CC  drivers/block/virtio_blk.mod.o
  LD [M]  drivers/block/virtio_blk.ko
  CC  drivers/bluetooth/ath3k.mod.o
  LD [M]  drivers/bluetooth/ath3k.ko
  CC  drivers/bluetooth/bcm203x.mod.o
  LD [M]  drivers/bluetooth/bcm203x.ko
  CC  drivers/bluetooth/bfusb.mod.o
  LD [M]  drivers/bluetooth/bfusb.ko
  CC  drivers/bluetooth/bluecard_cs.mod.o
  LD [M]  drivers/bluetooth/bluecard_cs.ko
  CC  drivers/bluetooth/bpa10x.mod.o
  LD [M]  drivers/bluetooth/bpa10x.ko
  CC  drivers/bluetooth/bt3c_cs.mod.o
  LD [M]  drivers/bluetooth/bt3c_cs.ko
  CC  drivers/bluetooth/btbcm.mod.o
  LD [M]  drivers/bluetooth/btbcm.ko
  CC  drivers/bluetooth/btintel.mod.o
  LD [M]  drivers/bluetooth/btintel.ko
  CC  drivers/bluetooth/btmrvl.mod.o
  LD [M]  drivers/bluetooth/btmrvl.ko
  CC  drivers/bluetooth/btmrvl_sdio.mod.o
  LD [M]  drivers/bluetooth/btmrvl_sdio.ko
  CC

[PATCH v5 1/3] PCI: generic: Refactor code to enable reuse by other drivers.

2016-02-05 Thread David Daney

From: David Daney 

No change in functionality.

Move structure definitions into a separate header file.  Move common
code to new file with Kconfig machinery to build it.  Split probe
function into two parts:

   - a small driver specific probe function (gen_pci_probe)

   - a common probe that can be used by other drivers
 (pci_host_common_probe)

Signed-off-by: David Daney 
Acked-by: Arnd Bergmann 
Acked-by: Will Deacon 
---
 MAINTAINERS |   1 +
 drivers/pci/host/Kconfig|   4 +
 drivers/pci/host/Makefile   |   1 +
 drivers/pci/host/pci-host-common.c  | 194 
 drivers/pci/host/pci-host-common.h  |  47 +
 drivers/pci/host/pci-host-generic.c | 181 +
 6 files changed, 251 insertions(+), 177 deletions(-)
 create mode 100644 drivers/pci/host/pci-host-common.c
 create mode 100644 drivers/pci/host/pci-host-common.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a9010d9..9287929 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8367,6 +8367,7 @@ L:linux-...@vger.kernel.org
 L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
 S: Maintained
 F: Documentation/devicetree/bindings/pci/host-generic-pci.txt
+F: drivers/pci/host/pci-host-common.c
 F: drivers/pci/host/pci-host-generic.c
 
 PCI DRIVER FOR INTEL VOLUME MANAGEMENT DEVICE (VMD)
diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 75a6054..65709b4 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -53,9 +53,13 @@ config PCI_RCAR_GEN2_PCIE
help
  Say Y here if you want PCIe controller support on R-Car Gen2 SoCs.
 
+config PCI_HOST_COMMON
+   bool
+
 config PCI_HOST_GENERIC
bool "Generic PCI host controller"
depends on (ARM || ARM64) && OF
+   select PCI_HOST_COMMON
help
  Say Y here if you want to support a simple generic PCI host
  controller, such as the one emulated by kvmtool.
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 7b2f20c..3b24af8 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -6,6 +6,7 @@ obj-$(CONFIG_PCI_MVEBU) += pci-mvebu.o
 obj-$(CONFIG_PCI_TEGRA) += pci-tegra.o
 obj-$(CONFIG_PCI_RCAR_GEN2) += pci-rcar-gen2.o
 obj-$(CONFIG_PCI_RCAR_GEN2_PCIE) += pcie-rcar.o
+obj-$(CONFIG_PCI_HOST_COMMON) += pci-host-common.o
 obj-$(CONFIG_PCI_HOST_GENERIC) += pci-host-generic.o
 obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
 obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o
diff --git a/drivers/pci/host/pci-host-common.c 
b/drivers/pci/host/pci-host-common.c
new file mode 100644
index 000..e9f850f
--- /dev/null
+++ b/drivers/pci/host/pci-host-common.c
@@ -0,0 +1,194 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2014 ARM Limited
+ *
+ * Author: Will Deacon 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pci-host-common.h"
+
+static void gen_pci_release_of_pci_ranges(struct gen_pci *pci)
+{
+   pci_free_resource_list(&pci->resources);
+}
+
+static int gen_pci_parse_request_of_pci_ranges(struct gen_pci *pci)
+{
+   int err, res_valid = 0;
+   struct device *dev = pci->host.dev.parent;
+   struct device_node *np = dev->of_node;
+   resource_size_t iobase;
+   struct resource_entry *win;
+
+   err = of_pci_get_host_bridge_resources(np, 0, 0xff, &pci->resources,
+  &iobase);
+   if (err)
+   return err;
+
+   resource_list_for_each_entry(win, &pci->resources) {
+   struct resource *parent, *res = win->res;
+
+   switch (resource_type(res)) {
+   case IORESOURCE_IO:
+   parent = &ioport_resource;
+   err = pci_remap_iospace(res, iobase);
+   if (err) {
+   dev_warn(dev, "error %d: failed to map resource 
%pR\n",
+err, res);
+   continue;
+   }
+   break;
+   case IORESOURCE_MEM:
+   parent = &iomem_resource;
+   res_valid |= !(res->flags & IORESOURCE_PREFETCH);
+   break;
+   case IORESOURCE_BUS:
+   pci->cfg.bus_range = res;
+   default:

[PATCH v5 3/3] pci, pci-thunder-ecam: Add driver for ThunderX-pass1 on-chip devices

2016-02-05 Thread David Daney

From: David Daney 

The cavium,pci-thunder-ecam devices are exactly ECAM based PCI root
complexes.  These root complexes (loosely referred to as ECAM units in
the hardware manuals) are used to access the Thunder on-chip devices.
They are special in that all the BARs on devices behind these root
complexes are at fixed addresses.  To handle this in a manner
compatible with the core PCI code, we have the config access functions
synthesize Enhanced Allocation (EA) capability entries for each BAR.

Since this EA synthesis is needed for exactly one chip model, we can
hard code some assumptions about the device topology and the
properties of specific DEVFNs in the driver.

Signed-off-by: David Daney 
---
 .../devicetree/bindings/pci/pci-thunder-ecam.txt   |  30 ++
 drivers/pci/host/Kconfig   |   7 +
 drivers/pci/host/Makefile  |   1 +
 drivers/pci/host/pci-thunder-ecam.c| 358 +
 4 files changed, 396 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-thunder-ecam.txt
 create mode 100644 drivers/pci/host/pci-thunder-ecam.c

diff --git a/Documentation/devicetree/bindings/pci/pci-thunder-ecam.txt 
b/Documentation/devicetree/bindings/pci/pci-thunder-ecam.txt
new file mode 100644
index 000..34658f2
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/pci-thunder-ecam.txt
@@ -0,0 +1,30 @@
+* ThunderX PCI host controller for pass-1.x silicon
+
+Firmware-initialized PCI host controller to on-chip devices found on
+some Cavium ThunderX processors.  These devices have ECAM based config
+access, but the BARs are all at fixed addresses.  We handle the fixed
+addresses by synthesizing Enhanced Allocation (EA) capabilities for
+these devices.
+
+The properties and their meanings are identical to those described in
+host-generic-pci.txt except as listed below.
+
+Properties of the host controller node that differ from
+host-generic-pci.txt:
+
+- compatible : Must be "cavium,pci-host-thunder-ecam"
+
+Example:
+
+   pci@84b0, {
+   compatible = "cavium,pci-host-thunder-ecam";
+   device_type = "pci";
+   msi-parent = <>;
+   msi-map = <0  0x3 0x1>;
+   bus-range = <0 31>;
+   #size-cells = <2>;
+   #address-cells = <3>;
+   #stream-id-cells = <1>;
+   reg = <0x84b0 0x 0 0x0200>;  /* Configuration space 
*/
+   ranges = <0x0300 0x8180 0x 0x8180 0x 0x80 
0x>; /* mem ranges */
+   };
diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 184df22..f8912c6 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -202,4 +202,11 @@ config PCI_HOST_THUNDER_PEM
help
  Say Y here if you want PCIe support for CN88XX Cavium Thunder SoCs.
 
+config PCI_HOST_THUNDER_ECAM
+   bool "Cavium Thunder ECAM controller to on-chip devices on pass-1.x 
silicon"
+   depends on OF && ARM64
+   select PCI_HOST_COMMON
+   help
+ Say Y here if you want ECAM support for CN88XX-Pass-1.x Cavium 
Thunder SoCs.
+
 endmenu
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 8903172..d6af3ba 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -23,4 +23,5 @@ obj-$(CONFIG_PCIE_ALTERA) += pcie-altera.o
 obj-$(CONFIG_PCIE_ALTERA_MSI) += pcie-altera-msi.o
 obj-$(CONFIG_PCI_HISI) += pcie-hisi.o
 obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o
+obj-$(CONFIG_PCI_HOST_THUNDER_ECAM) += pci-thunder-ecam.o
 obj-$(CONFIG_PCI_HOST_THUNDER_PEM) += pci-thunder-pem.o
diff --git a/drivers/pci/host/pci-thunder-ecam.c 
b/drivers/pci/host/pci-thunder-ecam.c
new file mode 100644
index 000..83ee590
--- /dev/null
+++ b/drivers/pci/host/pci-thunder-ecam.c
@@ -0,0 +1,358 @@
+/*
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2015 Cavium, Inc.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pci-host-common.h"
+
+/* Mapping is standard ECAM */
+static void __iomem *thunder_ecam_map_bus(struct pci_bus *bus,
+ unsigned int devfn,
+ int where)
+{
+   struct gen_pci *pci = bus->sysdata;
+   resource_size_t idx = bus->number - pci->cfg.bus_range->start;
+
+   return pci->cfg.win[idx] + ((devfn << 12) | where);
+}
+
+static void set_val(u32 v, int where, int size, u32 *val)
+{
+   int shift = (where & 3) * 8;
+
+   pr_debug("set_val %04x: %08x\n", (unsigned)(where & ~3), v);
+   v >>= shift;
+   if (size == 1)
+   v &= 0xff;
+   else if (size == 2)
+   v &= 0xffff;
+   *val = v;
+}
+
+static int handle_ea_bar(u32 e0, int bar, struct pci_bus *bus,
+
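
As a standalone illustration of the two helpers in this patch, the sketch
below models the per-bus ECAM offset used by thunder_ecam_map_bus() and the
shift-and-mask step used by set_val(). The names ecam_bus_offset and
cfg_extract are hypothetical, and the code runs in user space on plain
integers rather than on mapped config space.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of a config register inside one bus's ECAM window:
 * 4 KB of config space per function, indexed by devfn.
 */
uint32_t ecam_bus_offset(unsigned int devfn, int where)
{
	return ((uint32_t)devfn << 12) | (uint32_t)where;
}

/*
 * Extract a 1-, 2- or 4-byte value at byte offset 'where' from the
 * aligned 32-bit word that contains it.
 */
uint32_t cfg_extract(uint32_t dword, int where, int size)
{
	int shift = (where & 3) * 8;
	uint32_t v = dword >> shift;

	assert(size == 1 || size == 2 || size == 4);
	if (size == 1)
		v &= 0xff;
	else if (size == 2)
		v &= 0xffff;
	return v;
}
```

For example, reading one byte at offset 1 of the word 0x12345678 shifts by
8 bits and masks to 0x56.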

[PATCH v5 0/3] Add host controller drivers for Cavium ThunderX PCI

2016-02-05 Thread David Daney

From: David Daney 

Some Cavium ThunderX processors require quirky access methods for the
config space of the PCIe bridge.

There are now three patches:

1) Refactor code in pci-host-generic so that it can more easily be
   used by other drivers.  This splits the driver for CAM and ECAM
   access methods to a separate file from the common host driver code.

2) Add the ThunderX PCIe driver for external PCIe buses, which
   leverages the code in pci-host-generic

3) Add ThunderX PCI driver for internal SoC buses used on early
   ThunderX chip revisions.

Changes from v4: Added patch 3/3.  Stylistic changes to 2/3 suggested
by Bjorn Helgaas.  When expanding config write width to 32-bits, mask
out unintended writes to W1C bits, also suggested by Bjorn Helgaas.

Changes from v3: Add some Acked-by, rebased to v4.5.0-rc1

Changes from v2: Improve device tree binding example as noted by Rob
Herring.  Rename pcie-thunder-pem.* to pci-thunder-pem.* for better
consistency.  Update MAINTAINERS to reflect the changes.

Changes from v1: Split CAM and ECAM code from common driver code as
suggested by Arnd Bergmann.  Fix spelling errors in
pcie-thunder-pem.txt

David Daney (3):
  PCI: generic: Refactor code to enable reuse by other drivers.
  pci, pci-thunder-pem: Add PCIe host driver for ThunderX processors.
  pci, pci-thunder-ecam: Add driver for ThunderX-pass1 on-chip devices

 .../devicetree/bindings/pci/pci-thunder-ecam.txt   |  30 ++
 .../devicetree/bindings/pci/pci-thunder-pem.txt|  43 +++
 MAINTAINERS|   9 +
 drivers/pci/host/Kconfig   |  18 ++
 drivers/pci/host/Makefile  |   3 +
 drivers/pci/host/pci-host-common.c | 194 +++
 drivers/pci/host/pci-host-common.h |  47 +++
 drivers/pci/host/pci-host-generic.c| 181 +--
 drivers/pci/host/pci-thunder-ecam.c| 358 +
 drivers/pci/host/pci-thunder-pem.c | 332 +++
 10 files changed, 1038 insertions(+), 177 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-thunder-ecam.txt
 create mode 100644 Documentation/devicetree/bindings/pci/pci-thunder-pem.txt
 create mode 100644 drivers/pci/host/pci-host-common.c
 create mode 100644 drivers/pci/host/pci-host-common.h
 create mode 100644 drivers/pci/host/pci-thunder-ecam.c
 create mode 100644 drivers/pci/host/pci-thunder-pem.c

-- 
1.8.3.1

[PATCH v5 2/3] pci, pci-thunder-pem: Add PCIe host driver for ThunderX processors.

2016-02-05 Thread David Daney

From: David Daney 

The root complexes used to access off-chip PCIe devices (called PEM
units in the hardware manuals) on some Cavium ThunderX processors
require quirky access methods for the config space of the PCIe bridge.
Add a driver to provide these config space accessor functions.  The
pci-host-common code is used to configure the PCI machinery.

Signed-off-by: David Daney 
Acked-by: Rob Herring 
Acked-by: Arnd Bergmann 
---
 .../devicetree/bindings/pci/pci-thunder-pem.txt|  43 +++
 MAINTAINERS|   8 +
 drivers/pci/host/Kconfig   |   7 +
 drivers/pci/host/Makefile  |   1 +
 drivers/pci/host/pci-thunder-pem.c | 332 +
 5 files changed, 391 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-thunder-pem.txt
 create mode 100644 drivers/pci/host/pci-thunder-pem.c

diff --git a/Documentation/devicetree/bindings/pci/pci-thunder-pem.txt b/Documentation/devicetree/bindings/pci/pci-thunder-pem.txt
new file mode 100644
index 000..f131fae
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/pci-thunder-pem.txt
@@ -0,0 +1,43 @@
+* ThunderX PEM PCIe host controller
+
+Firmware-initialized PCI host controller found on some Cavium
+ThunderX processors.
+
+The properties and their meanings are identical to those described in
+host-generic-pci.txt except as listed below.
+
+Properties of the host controller node that differ from
+host-generic-pci.txt:
+
+- compatible : Must be "cavium,pci-host-thunder-pem"
+
+- reg: Two entries: First, the base address and size of the
+   configuration space for downstream devices, as accessed
+   from the parent bus. Second, the register bank of
+   the PEM device PCIe bridge.
+
+Example:
+
+pci@87e0,c200 {
+   compatible = "cavium,pci-host-thunder-pem";
+   device_type = "pci";
+   msi-parent = <>;
+   msi-map = <0  0x1 0x1>;
+   bus-range = <0x8f 0xc7>;
+   #size-cells = <2>;
+   #address-cells = <3>;
+
+   reg = <0x8880 0x8f00 0x0 0x3900>,  /* Configuration space */
+ <0x87e0 0xc200 0x0 0x0001>; /* PEM space */
+   ranges = <0x0100 0x00 0x0002 0x88b0 0x0002 0x00 0x0001>, /* I/O */
+<0x0300 0x00 0x1000 0x8890 0x1000 0x0f 0xf000>, /* mem64 */
+<0x4300 0x10 0x 0x88a0 0x 0x10 0x>, /* mem64-pref */
+<0x0300 0x87e0 0xc2f0 0x87e0 0xc200 0x00 0x0010>; /* mem64 PEM BAR4 */
+
+   #interrupt-cells = <1>;
+   interrupt-map-mask = <0 0 0 7>;
+   interrupt-map = <0 0 0 1  0 0 0 24 4>, /* INTA */
+   <0 0 0 2  0 0 0 25 4>, /* INTB */
+   <0 0 0 3  0 0 0 26 4>, /* INTC */
+   <0 0 0 4  0 0 0 27 4>; /* INTD */
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index 9287929..ff5c367 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8413,6 +8413,14 @@ L: linux-arm-...@vger.kernel.org
 S: Maintained
 F: drivers/pci/host/*qcom*
 
+PCIE DRIVER FOR CAVIUM THUNDERX
+M: David Daney 
+L: linux-...@vger.kernel.org
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+S: Supported
+F: Documentation/devicetree/bindings/pci/pci-thunder-*
+F: drivers/pci/host/pci-thunder-*
+
 PCMCIA SUBSYSTEM
 P: Linux PCMCIA Team
 L: linux-pcm...@lists.infradead.org
diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index 65709b4..184df22 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -195,4 +195,11 @@ config PCIE_QCOM
  PCIe controller uses the Designware core plus Qualcomm-specific
  hardware wrappers.
 
+config PCI_HOST_THUNDER_PEM
+   bool "Cavium Thunder PCIe controller to off-chip devices"
+   depends on OF && ARM64
+   select PCI_HOST_COMMON
+   help
+ Say Y here if you want PCIe support for CN88XX Cavium Thunder SoCs.
+
 endmenu
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 3b24af8..8903172 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -23,3 +23,4 @@ obj-$(CONFIG_PCIE_ALTERA) += pcie-altera.o
 obj-$(CONFIG_PCIE_ALTERA_MSI) += pcie-altera-msi.o
 obj-$(CONFIG_PCI_HISI) += pcie-hisi.o
 obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o
+obj-$(CONFIG_PCI_HOST_THUNDER_PEM) += pci-thunder-pem.o
diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c
new file mode 100644
index 000..e3a6554
--- /dev/null
+++ b/drivers/pci/host/pci-thunder-pem.c
@@ -0,0 +1,332 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ *

[PATCH 17/30] rapidio/rionet: add locking into add/remove device

2016-02-05 Thread Alexandre Bounine

Add spinlock protection when handling the list of connected peers, and add
the ability to handle a new peer device being added after the RIONET device
has been opened. Before this update, RIONET sent JOIN requests only when it
was opened, so peer devices added later were left out of this process.

Signed-off-by: Alexandre Bounine 
Cc: Matt Porter 
Cc: Aurelien Jacquiot 
Cc: Andre van Herk 
Cc: linux-kernel@vger.kernel.org
Cc: net...@vger.kernel.org
---
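A minimal userspace sketch of the locking pattern described above, under
stated assumptions: struct peer_table, peer_join() and peer_leave() are
hypothetical names, and a C11 atomic_flag stands in for the kernel
spinlock.  It shows the active-peer table and its counter being updated
only while the lock is held, with LEAVE guarded so that a repeated LEAVE
cannot push the counter below zero.  Unlike the patch, this simplified
version also does the JOIN check under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_PEERS 8   /* hypothetical table size */

struct peer_table {
    atomic_flag lock;               /* stand-in for spinlock_t */
    const void *active[MAX_PEERS];  /* active peer per destination ID */
    int nact;                       /* number of active peers */
};

static void peer_table_init(struct peer_table *t)
{
    size_t i;

    atomic_flag_clear(&t->lock);
    for (i = 0; i < MAX_PEERS; i++)
        t->active[i] = NULL;
    t->nact = 0;
}

static void table_lock(struct peer_table *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void table_unlock(struct peer_table *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* JOIN: mark the peer active once, even if JOIN arrives twice. */
static void peer_join(struct peer_table *t, unsigned int sid, const void *rdev)
{
    assert(sid < MAX_PEERS);
    table_lock(t);
    if (!t->active[sid]) {
        t->active[sid] = rdev;
        t->nact++;
    }
    table_unlock(t);
}

/* LEAVE: only decrement the counter if the peer was really active. */
static void peer_leave(struct peer_table *t, unsigned int sid)
{
    assert(sid < MAX_PEERS);
    table_lock(t);
    if (t->active[sid]) {
        t->active[sid] = NULL;
        t->nact--;
    }
    table_unlock(t);
}
```
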
 drivers/net/rionet.c |  152 +
 1 files changed, 102 insertions(+), 50 deletions(-)

diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c
index f994fa1..c15d958 100644
--- a/drivers/net/rionet.c
+++ b/drivers/net/rionet.c
@@ -63,6 +63,7 @@ struct rionet_private {
spinlock_t lock;
spinlock_t tx_lock;
u32 msg_enable;
+   bool open;
 };
 
 struct rionet_peer {
@@ -74,6 +75,7 @@ struct rionet_peer {
 struct rionet_net {
struct net_device *ndev;
struct list_head peers;
+   spinlock_t lock;/* net info access lock */
struct rio_dev **active;
int nact;   /* number of active peers */
 };
@@ -235,26 +237,32 @@ static void rionet_dbell_event(struct rio_mport *mport, void *dev_id, u16 sid, u
struct net_device *ndev = dev_id;
struct rionet_private *rnet = netdev_priv(ndev);
struct rionet_peer *peer;
+   unsigned char netid = rnet->mport->id;
 
if (netif_msg_intr(rnet))
printk(KERN_INFO "%s: doorbell sid %4.4x tid %4.4x info %4.4x",
   DRV_NAME, sid, tid, info);
if (info == RIONET_DOORBELL_JOIN) {
-   if (!nets[rnet->mport->id].active[sid]) {
-   list_for_each_entry(peer,
-  [rnet->mport->id].peers, node) {
+   if (!nets[netid].active[sid]) {
+   spin_lock([netid].lock);
+   list_for_each_entry(peer, [netid].peers, node) {
if (peer->rdev->destid == sid) {
-   nets[rnet->mport->id].active[sid] =
-   peer->rdev;
-   nets[rnet->mport->id].nact++;
+   nets[netid].active[sid] = peer->rdev;
+   nets[netid].nact++;
}
}
+   spin_unlock([netid].lock);
+
rio_mport_send_doorbell(mport, sid,
RIONET_DOORBELL_JOIN);
}
} else if (info == RIONET_DOORBELL_LEAVE) {
-   nets[rnet->mport->id].active[sid] = NULL;
-   nets[rnet->mport->id].nact--;
+   spin_lock([netid].lock);
+   if (nets[netid].active[sid]) {
+   nets[netid].active[sid] = NULL;
+   nets[netid].nact--;
+   }
+   spin_unlock([netid].lock);
} else {
if (netif_msg_intr(rnet))
printk(KERN_WARNING "%s: unhandled doorbell\n",
@@ -308,8 +316,10 @@ static void rionet_outb_msg_event(struct rio_mport *mport, void *dev_id, int mbo
 static int rionet_open(struct net_device *ndev)
 {
int i, rc = 0;
-   struct rionet_peer *peer, *tmp;
+   struct rionet_peer *peer;
struct rionet_private *rnet = netdev_priv(ndev);
+   unsigned char netid = rnet->mport->id;
+   unsigned long flags;
 
if (netif_msg_ifup(rnet))
printk(KERN_INFO "%s: open\n", DRV_NAME);
@@ -348,20 +358,13 @@ static int rionet_open(struct net_device *ndev)
netif_carrier_on(ndev);
netif_start_queue(ndev);
 
-   list_for_each_entry_safe(peer, tmp,
-[rnet->mport->id].peers, node) {
-   if (!(peer->res = rio_request_outb_dbell(peer->rdev,
-RIONET_DOORBELL_JOIN,
-RIONET_DOORBELL_LEAVE)))
-   {
-   printk(KERN_ERR "%s: error requesting doorbells\n",
-  DRV_NAME);
-   continue;
-   }
-
+   spin_lock_irqsave([netid].lock, flags);
+   list_for_each_entry(peer, [netid].peers, node) {
/* Send a join message */
rio_send_doorbell(peer->rdev, RIONET_DOORBELL_JOIN);
}
+   spin_unlock_irqrestore([netid].lock, flags);
+   rnet->open = true;
 
   out:
return rc;
@@ -370,7 +373,9 @@ static int rionet_open(struct net_device *ndev)
 static int rionet_close(struct net_device *ndev)
 {
struct rionet_private *rnet = netdev_priv(ndev);
-   struct rionet_peer *peer, *tmp;
+   struct rionet_peer *peer;
+   unsigned char netid

[PATCH] mmc: mmc_spi: add checks for dma mapping error

2016-02-05 Thread Alexey Khoroshilov

There are no checks for DMA mapping errors in mmc_spi.
The patch fixes that and also adds the dma_unmap_single(ones_dma) call
that was missing on a failure path of mmc_spi_probe().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
---
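A minimal userspace sketch of the error-unwinding pattern described above,
under stated assumptions: struct fake_dev, fake_map() and fake_unmap() are
hypothetical stand-ins for the DMA mapping calls, not kernel APIs.  It
shows each mapping being checked right after it is made, with goto labels
ordered so that a failure undoes only the mappings that already succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device whose mappings can fail. */
struct fake_dev {
    int fail_on;   /* 1-based index of the mapping call that fails, 0 = none */
    int calls;     /* mapping calls made so far */
    int mapped;    /* mappings currently live */
};

/* Returns false on a (simulated) mapping error. */
static bool fake_map(struct fake_dev *d)
{
    if (++d->calls == d->fail_on)
        return false;
    d->mapped++;
    return true;
}

static void fake_unmap(struct fake_dev *d)
{
    d->mapped--;
}

/* Two mappings, each checked; labels unwind in reverse order. */
static int probe_sketch(struct fake_dev *d)
{
    if (!fake_map(d))       /* first buffer ("ones") */
        goto fail_ones;
    if (!fake_map(d))       /* second buffer ("data") */
        goto fail_data;
    return 0;

fail_data:
    fake_unmap(d);          /* undo the first mapping only */
fail_ones:
    return -ENOMEM;
}
```
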
 drivers/mmc/host/mmc_spi.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
index 1c1b45ef3faf..3446097a43c0 100644
--- a/drivers/mmc/host/mmc_spi.c
+++ b/drivers/mmc/host/mmc_spi.c
@@ -925,6 +925,10 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,
 
dma_addr = dma_map_page(dma_dev, sg_page(sg), 0,
PAGE_SIZE, dir);
+   if (dma_mapping_error(dma_dev, dma_addr)) {
+   data->error = -EFAULT;
+   break;
+   }
if (direction == DMA_TO_DEVICE)
t->tx_dma = dma_addr + sg->offset;
else
@@ -1393,10 +1397,12 @@ static int mmc_spi_probe(struct spi_device *spi)
host->dma_dev = dev;
host->ones_dma = dma_map_single(dev, ones,
MMC_SPI_BLOCKSIZE, DMA_TO_DEVICE);
+   if (dma_mapping_error(dev, host->ones_dma))
+   goto fail_ones_dma;
host->data_dma = dma_map_single(dev, host->data,
sizeof(*host->data), DMA_BIDIRECTIONAL);
-
-   /* REVISIT in theory those map operations can fail... */
+   if (dma_mapping_error(dev, host->data_dma))
+   goto fail_data_dma;
 
dma_sync_single_for_cpu(host->dma_dev,
host->data_dma, sizeof(*host->data),
@@ -1462,6 +1468,11 @@ fail_glue_init:
if (host->dma_dev)
dma_unmap_single(host->dma_dev, host->data_dma,
sizeof(*host->data), DMA_BIDIRECTIONAL);
+fail_data_dma:
+   if (host->dma_dev)
+   dma_unmap_single(host->dma_dev, host->ones_dma,
+   MMC_SPI_BLOCKSIZE, DMA_TO_DEVICE);
+fail_ones_dma:
kfree(host->data);
 
 fail_nobuf1:
-- 
1.9.1

[PATCH 08/30] rapidio/tsi721: add query_mport callback

2016-02-05 Thread Alexandre Bounine

Add device-specific implementation of query_mport callback function.

Signed-off-by: Alexandre Bounine 
Cc: Matt Porter 
Cc: Aurelien Jacquiot 
Cc: Andre van Herk 
Cc: linux-kernel@vger.kernel.org
---
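A minimal userspace sketch of the mask-and-shift field extraction that the
callback performs on the port registers.  The shift counts 28 and 27 come
from the patch; the two mask values and the helper name field_get() are
assumptions made for illustration and should be checked against the real
register definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask values, assumed here rather than taken from the patch. */
#define SKETCH_SEL_BAUD_MASK 0xf0000000u   /* 4-bit field at bits 31:28 */
#define SKETCH_IPW_MASK      0x38000000u   /* 3-bit field at bits 29:27 */

/* Isolate a register field with its mask, then move it down to bit 0. */
static uint32_t field_get(uint32_t reg, uint32_t mask, unsigned int shift)
{
    return (reg & mask) >> shift;
}
```

With these assumed masks, field_get(reg, SKETCH_SEL_BAUD_MASK, 28)
corresponds to the link_speed assignment and
field_get(reg, SKETCH_IPW_MASK, 27) to the link_width assignment.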
 drivers/rapidio/devices/tsi721.c |   34 ++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/drivers/rapidio/devices/tsi721.c b/drivers/rapidio/devices/tsi721.c
index d463d2c..cd40f0f 100644
--- a/drivers/rapidio/devices/tsi721.c
+++ b/drivers/rapidio/devices/tsi721.c
@@ -2288,6 +2288,39 @@ static int tsi721_messages_init(struct tsi721_device *priv)
 }
 
 /**
+ * tsi721_query_mport - Query attributes of the Tsi721 master port
+ * @mport: Master port to query
+ * @attr: Structure to be filled with the master port attributes
+ *
+ * Always returns 0.
+ */
+static int tsi721_query_mport(struct rio_mport *mport,
+ struct rio_mport_attr *attr)
+{
+   struct tsi721_device *priv = mport->priv;
+   u32 rval;
+
+   rval = ioread32(priv->regs + (0x100 + RIO_PORT_N_ERR_STS_CSR(0)));
+   if (rval & RIO_PORT_N_ERR_STS_PORT_OK) {
+   rval = ioread32(priv->regs + (0x100 + RIO_PORT_N_CTL2_CSR(0)));
+   attr->link_speed = (rval & RIO_PORT_N_CTL2_SEL_BAUD) >> 28;
+   rval = ioread32(priv->regs + (0x100 + RIO_PORT_N_CTL_CSR(0)));
+   attr->link_width = (rval & RIO_PORT_N_CTL_IPW) >> 27;
+   } else
+   attr->link_speed = RIO_LINK_DOWN;
+
+#ifdef CONFIG_RAPIDIO_DMA_ENGINE
+   attr->flags = RIO_MPORT_DMA | RIO_MPORT_DMA_SG;
+   attr->dma_max_sge = 0;
+   attr->dma_max_size = TSI721_BDMA_MAX_BCOUNT;
+   attr->dma_align = 0;
+#else
+   attr->flags = 0;
+#endif
+   return 0;
+}
+
+/**
  * tsi721_disable_ints - disables all device interrupts
  * @priv: pointer to tsi721 private data
  */
@@ -2372,6 +2405,7 @@ static int tsi721_setup_mport(struct tsi721_device *priv)
ops->get_inb_message = tsi721_get_inb_message;
ops->map_inb = tsi721_rio_map_inb_mem;
ops->unmap_inb = tsi721_rio_unmap_inb_mem;
+   ops->query_mport = tsi721_query_mport;
 
mport = kzalloc(sizeof(struct rio_mport), GFP_KERNEL);
if (!mport) {
-- 
1.7.8.4

[PATCH 27/30] rapidio/tsi721_dma: update error reporting from prep_sg callback

2016-02-05 Thread Alexandre Bounine

Switch to returning an error-valued pointer instead of a plain NULL pointer.
This makes it possible to identify the case where the request queue is full,
and therefore gives the upper layer the option to retry the operation later.

Signed-off-by: Alexandre Bounine 
Cc: Matt Porter 
Cc: Aurelien Jacquiot 
Cc: Andre van Herk 
Cc: linux-kernel@vger.kernel.org
---
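A minimal userspace sketch of the error-valued pointer convention the
message refers to, under stated assumptions: err_ptr(), ptr_err() and
is_err() are simplified imitations of the kernel helpers, and struct pool
and prep_sketch() are hypothetical.  It shows how a caller can tell
"bad arguments" (-EINVAL) from "no free descriptor, retry later" (-EBUSY),
which a plain NULL return cannot express.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error-valued pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct desc { int id; };

/* Hypothetical free list holding at most four descriptors. */
struct pool {
    struct desc *free[4];
    int nfree;
};

static struct desc *prep_sketch(struct pool *p, const void *sgl,
                                unsigned int sg_len)
{
    if (!sgl || !sg_len)
        return err_ptr(-EINVAL);    /* caller error, do not retry */
    if (p->nfree == 0)
        return err_ptr(-EBUSY);     /* queue full, caller may retry */
    return p->free[--p->nfree];
}
```
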
 drivers/rapidio/devices/tsi721_dma.c |   37 +++--
 1 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/rapidio/devices/tsi721_dma.c b/drivers/rapidio/devices/tsi721_dma.c
index 494482e..5bc9071 100644
--- a/drivers/rapidio/devices/tsi721_dma.c
+++ b/drivers/rapidio/devices/tsi721_dma.c
@@ -767,7 +767,7 @@ struct dma_async_tx_descriptor *tsi721_prep_rio_sg(struct dma_chan *dchan,
void *tinfo)
 {
struct tsi721_bdma_chan *bdma_chan = to_tsi721_chan(dchan);
-   struct tsi721_tx_desc *desc, *_d;
+   struct tsi721_tx_desc *desc;
struct rio_dma_ext *rext = tinfo;
enum dma_rtype rtype;
struct dma_async_tx_descriptor *txd = NULL;
@@ -775,7 +775,7 @@ struct dma_async_tx_descriptor *tsi721_prep_rio_sg(struct dma_chan *dchan,
if (!sgl || !sg_len) {
tsi_err(>dev->device, "DMAC%d No SG list",
bdma_chan->id);
-   return NULL;
+   return ERR_PTR(-EINVAL);
}
 
tsi_debug(DMA, >dev->device, "DMAC%d %s", bdma_chan->id,
@@ -800,28 +800,33 @@ struct dma_async_tx_descriptor *tsi721_prep_rio_sg(struct dma_chan *dchan,
tsi_err(>dev->device,
"DMAC%d Unsupported DMA direction option",
bdma_chan->id);
-   return NULL;
+   return ERR_PTR(-EINVAL);
}
 
spin_lock_bh(_chan->lock);
 
-   list_for_each_entry_safe(desc, _d, _chan->free_list, desc_node) {
-   if (async_tx_test_ack(>txd)) {
-   list_del_init(>desc_node);
-   desc->destid = rext->destid;
-   desc->rio_addr = rext->rio_addr;
-   desc->rio_addr_u = 0;
-   desc->rtype = rtype;
-   desc->sg_len= sg_len;
-   desc->sg= sgl;
-   txd = >txd;
-   txd->flags  = flags;
-   break;
-   }
+   if (!list_empty(_chan->free_list)) {
+   desc = list_first_entry(_chan->free_list,
+   struct tsi721_tx_desc, desc_node);
+   list_del_init(>desc_node);
+   desc->destid = rext->destid;
+   desc->rio_addr = rext->rio_addr;
+   desc->rio_addr_u = 0;
+   desc->rtype = rtype;
+   desc->sg_len= sg_len;
+   desc->sg= sgl;
+   txd = >txd;
+   txd->flags  = flags;
}
 
spin_unlock_bh(_chan->lock);
 
+   if (!txd) {
+   tsi_debug(DMA, >dev->device,
+ "DMAC%d free TXD is not available", bdma_chan->id);
+   return ERR_PTR(-EBUSY);
+   }
+
return txd;
 }
 
-- 
1.7.8.4

[PATCH 09/30] rapidio: add shutdown notification for RapidIO devices

2016-02-05 Thread Alexandre Bounine

Add bus-specific callback to stop RapidIO devices during a system shutdown.

Signed-off-by: Alexandre Bounine 
Cc: Matt Porter 
Cc: Aurelien Jacquiot 
Cc: Andre van Herk 
Cc: linux-kernel@vger.kernel.org
---
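A minimal userspace sketch of the dispatch this patch adds, under stated
assumptions: struct sk_dev, struct sk_driver and bus_shutdown() are
hypothetical names.  It shows a bus-level shutdown hook forwarding to a
driver's optional shutdown callback, and doing nothing when the device has
no driver bound or the driver does not implement the callback.

```c
#include <assert.h>
#include <stddef.h>

struct sk_dev;

struct sk_driver {
    void (*shutdown)(struct sk_dev *dev);   /* optional, may be NULL */
};

struct sk_dev {
    struct sk_driver *driver;   /* NULL when no driver is bound */
    int stopped;                /* set by the example callback */
};

/* Bus-level hook: call the driver's callback only if it exists. */
static void bus_shutdown(struct sk_dev *dev)
{
    struct sk_driver *drv = dev->driver;

    if (drv && drv->shutdown)
        drv->shutdown(dev);
}

/* Example driver callback: record that the device was stopped. */
static void stop_dev(struct sk_dev *dev)
{
    dev->stopped = 1;
}
```
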
 drivers/rapidio/rio-driver.c |   12 
 include/linux/rio.h  |2 ++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/drivers/rapidio/rio-driver.c b/drivers/rapidio/rio-driver.c
index f301f05..128350f 100644
--- a/drivers/rapidio/rio-driver.c
+++ b/drivers/rapidio/rio-driver.c
@@ -131,6 +131,17 @@ static int rio_device_remove(struct device *dev)
return 0;
 }
 
+static void rio_device_shutdown(struct device *dev)
+{
+   struct rio_dev *rdev = to_rio_dev(dev);
+   struct rio_driver *rdrv = rdev->driver;
+
+   dev_dbg(dev, "RIO: %s\n", __func__);
+
+   if (rdrv && rdrv->shutdown)
+   rdrv->shutdown(rdev);
+}
+
 /**
  *  rio_register_driver - register a new RIO driver
  *  @rdrv: the RIO driver structure to register
@@ -229,6 +240,7 @@ struct bus_type rio_bus_type = {
.bus_groups = rio_bus_groups,
.probe = rio_device_probe,
.remove = rio_device_remove,
+   .shutdown = rio_device_shutdown,
.uevent = rio_uevent,
 };
 
diff --git a/include/linux/rio.h b/include/linux/rio.h
index 8996a62..c64a0ba 100644
--- a/include/linux/rio.h
+++ b/include/linux/rio.h
@@ -423,6 +423,7 @@ struct rio_ops {
  * @id_table: RIO device ids to be associated with this driver
  * @probe: RIO device inserted
  * @remove: RIO device removed
+ * @shutdown: shutdown notification callback
  * @suspend: RIO device suspended
  * @resume: RIO device awakened
  * @enable_wake: RIO device enable wake event
@@ -437,6 +438,7 @@ struct rio_driver {
const struct rio_device_id *id_table;
int (*probe) (struct rio_dev * dev, const struct rio_device_id * id);
void (*remove) (struct rio_dev * dev);
+   void (*shutdown)(struct rio_dev *dev);
int (*suspend) (struct rio_dev * dev, u32 state);
int (*resume) (struct rio_dev * dev);
int (*enable_wake) (struct rio_dev * dev, u32 state, int enable);
-- 
1.7.8.4
