Re: Weird NET_RX softirq behavior

2014-08-07 Thread Eric Dumazet
On Fri, 2014-08-08 at 10:37 +0800, Jisheng Zhang wrote:
> nd and receive 7 packets so far, about 1400 bytes. Seems small
> compared with the CPU1 and CPU2 NET_RX softirq numbers, right?
> 
> Any other possible case?

Multicast loop. Check dev_loopback_xmit() and its callers.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH V3 1/2] ASoC: fsl_esai: refine esai for TDM support

2014-08-07 Thread Shengjiu Wang
The original driver did not store the number of slots; it fixed the slot
count at 2 and used that default to calculate bclk and pins for TX/RX.
This patch adds a parameter for slots and updates the calculation of
bclk and pins for TX/RX, so the driver becomes compatible with slots > 2
in TDM mode.
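
For illustration only, the two calculations described above can be sketched
in userspace C. The helper names esai_bclk() and esai_pins() are hypothetical
and not part of the driver; DIV_ROUND_UP() here is a local stand-in for the
kernel macro of the same name.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bit clock in Hz: each frame carries `slots` slots of `slot_width` bits. */
static uint32_t esai_bclk(uint32_t rate, uint32_t slot_width, uint32_t slots)
{
	assert(slots > 0);
	return rate * slot_width * slots;
}

/* Number of data pins needed when every pin carries `slots` channels. */
static uint32_t esai_pins(uint32_t channels, uint32_t slots)
{
	assert(slots > 0);
	return DIV_ROUND_UP(channels, slots);
}
```

With 48 kHz and 32-bit slots, the default of 2 slots gives a 3.072 MHz bclk
and 4 pins for 8 channels, while 8 slots gives 12.288 MHz and a single pin.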

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_esai.c |   14 +++---
 sound/soc/fsl/fsl_esai.h |8 
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index 72d154e..f252370 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -38,6 +38,7 @@
  * @fsysclk: system clock source to derive HCK, SCK and FS
  * @fifo_depth: depth of tx/rx FIFO
  * @slot_width: width of each DAI slot
+ * @slots: number of slots
  * @hck_rate: clock rate of desired HCKx clock
  * @sck_rate: clock rate of desired SCKx clock
  * @hck_dir: the direction of HCKx pads
@@ -56,6 +57,7 @@ struct fsl_esai {
struct clk *fsysclk;
u32 fifo_depth;
u32 slot_width;
+   u32 slots;
u32 hck_rate[2];
u32 sck_rate[2];
bool hck_dir[2];
@@ -363,6 +365,7 @@ static int fsl_esai_set_dai_tdm_slot(struct snd_soc_dai *dai, u32 tx_mask,
   ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(rx_mask));
 
esai_priv->slot_width = slot_width;
+   esai_priv->slots = slots;
 
return 0;
 }
@@ -510,10 +513,11 @@ static int fsl_esai_hw_params(struct snd_pcm_substream *substream,
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
u32 width = snd_pcm_format_width(params_format(params));
u32 channels = params_channels(params);
+   u32 pins = DIV_ROUND_UP(channels, esai_priv->slots);
u32 bclk, mask, val;
int ret;
 
-   bclk = params_rate(params) * esai_priv->slot_width * 2;
+   bclk = params_rate(params) * esai_priv->slot_width * esai_priv->slots;
 
ret = fsl_esai_set_bclk(dai, tx, bclk);
if (ret)
@@ -530,7 +534,7 @@ static int fsl_esai_hw_params(struct snd_pcm_substream *substream,
mask = ESAI_xFCR_xFR_MASK | ESAI_xFCR_xWA_MASK | ESAI_xFCR_xFWM_MASK |
  (tx ? ESAI_xFCR_TE_MASK | ESAI_xFCR_TIEN : ESAI_xFCR_RE_MASK);
val = ESAI_xFCR_xWA(width) | ESAI_xFCR_xFWM(esai_priv->fifo_depth) |
-(tx ? ESAI_xFCR_TE(channels) | ESAI_xFCR_TIEN : ESAI_xFCR_RE(channels));
+(tx ? ESAI_xFCR_TE(pins) | ESAI_xFCR_TIEN : ESAI_xFCR_RE(pins));
 
regmap_update_bits(esai_priv->regmap, REG_ESAI_xFCR(tx), mask, val);
 
@@ -565,6 +569,7 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
u8 i, channels = substream->runtime->channels;
+   u32 pins = DIV_ROUND_UP(channels, esai_priv->slots);
 
switch (cmd) {
case SNDRV_PCM_TRIGGER_START:
@@ -579,7 +584,7 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
 
regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx),
   tx ? ESAI_xCR_TE_MASK : ESAI_xCR_RE_MASK,
-  tx ? ESAI_xCR_TE(channels) : ESAI_xCR_RE(channels));
+  tx ? ESAI_xCR_TE(pins) : ESAI_xCR_RE(pins));
break;
case SNDRV_PCM_TRIGGER_SUSPEND:
case SNDRV_PCM_TRIGGER_STOP:
@@ -783,6 +788,9 @@ static int fsl_esai_probe(struct platform_device *pdev)
/* Set a default slot size */
esai_priv->slot_width = 32;
 
+   /* Set a default slot number */
+   esai_priv->slots = 2;
+
/* Set a default master/slave state */
esai_priv->slave_mode = true;
 
diff --git a/sound/soc/fsl/fsl_esai.h b/sound/soc/fsl/fsl_esai.h
index 75e1403..91a550f 100644
--- a/sound/soc/fsl/fsl_esai.h
+++ b/sound/soc/fsl/fsl_esai.h
@@ -130,8 +130,8 @@
 #define ESAI_xFCR_RE_WIDTH 4
 #define ESAI_xFCR_TE_MASK  (((1 << ESAI_xFCR_TE_WIDTH) - 1) << ESAI_xFCR_xE_SHIFT)
 #define ESAI_xFCR_RE_MASK  (((1 << ESAI_xFCR_RE_WIDTH) - 1) << ESAI_xFCR_xE_SHIFT)
-#define ESAI_xFCR_TE(x)    ((ESAI_xFCR_TE_MASK >> (ESAI_xFCR_TE_WIDTH - ((x + 1) >> 1))) & ESAI_xFCR_TE_MASK)
-#define ESAI_xFCR_RE(x)    ((ESAI_xFCR_RE_MASK >> (ESAI_xFCR_RE_WIDTH - ((x + 1) >> 1))) & ESAI_xFCR_RE_MASK)
+#define ESAI_xFCR_TE(x)    ((ESAI_xFCR_TE_MASK >> (ESAI_xFCR_TE_WIDTH - x)) & ESAI_xFCR_TE_MASK)
+#define ESAI_xFCR_RE(x)    ((ESAI_xFCR_RE_MASK >> (ESAI_xFCR_RE_WIDTH - x)) & ESAI_xFCR_RE_MASK)
 #define ESAI_xFCR_xFR_SHIFT 1
 #define ESAI_xFCR_xFR_MASK (1 << ESAI_xFCR_xFR_SHIFT)
 #define ESAI_xFCR_xFR  (1 << ESAI_xFCR_xFR_SHIFT)
@@ -272,8 +272,8 @@
 #define ESAI_xCR_RE_WIDTH  4
 #define ESAI_xCR_TE_MASK   (((1 << ESAI_xCR_TE_WIDTH) - 1) << ESAI_xCR_xE_SHIFT)
 #define ESAI_xCR_RE_MASK   (((1 << ESAI_xCR_RE_WIDTH) - 

Re: [patch] mm, hugetlb_cgroup: align hugetlb cgroup limit to hugepage size

2014-08-07 Thread Aneesh Kumar K.V
David Rientjes  writes:

> Memcg aligns memory.limit_in_bytes to PAGE_SIZE as part of the resource counter
> since it makes no sense to allow a partial page to be charged.
>
> As a result of the hugetlb cgroup using the resource counter, it is also aligned
> to PAGE_SIZE but makes no sense unless aligned to the size of the hugepage being
> limited.
>
> Align hugetlb cgroup limit to hugepage size.
>
> Signed-off-by: David Rientjes 
> ---
>  mm/hugetlb_cgroup.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -275,6 +275,8 @@ static ssize_t hugetlb_cgroup_write(struct kernfs_open_file *of,
>   ret = res_counter_memparse_write_strategy(buf, &val);
>   if (ret)
>   break;
> + val = ALIGN(val, 1 << (huge_page_order(&hstates[idx]) +
> +PAGE_SHIFT));

Could you use 1UL << huge_page_shift(hstate) here instead?
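
To illustrate why the unsigned long form matters, here is a minimal userspace
sketch. It assumes 4 KiB base pages (PAGE_SHIFT of 12); align_up() and
hugepage_size() are hypothetical stand-ins for the kernel's ALIGN() and for
the shift computation, not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, as on x86-64. */
#define PAGE_SHIFT 12

/* Stand-in for ALIGN(): round x up to a multiple of a (a power of two). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Hugepage size in bytes for a given page order. */
static uint64_t hugepage_size(unsigned int order)
{
	/* A 64-bit 1 keeps the shift valid; a plain int 1 overflows at 31+. */
	return 1ULL << (order + PAGE_SHIFT);
}
```

With order 9 (2 MiB hugepages) a 3 MiB limit rounds up to 4 MiB. With order
22 (16 GiB hugepages) the shift is 34, which a plain int `1 << ...` cannot
represent.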

>   ret = res_counter_set_limit(&h_cg->hugepage[idx], val);
>   break;
>   default:
>

-aneesh



linux-next: manual merge of the akpm-current tree with Linus' tree

2014-08-07 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in
mm/memcontrol.c between commit 61e02c745721 ("mm: memcontrol: clean up
reclaim size variable use in try_charge()") from Linus' tree and
various commits from the akpm-current tree.

I fixed it up (I think) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




[PATCH V2] regulator: DA9211 : support DA9213

2014-08-07 Thread James Ban
This patch adds support for the DA9213.

Signed-off-by: James Ban 
---

This patch is relative to linux-next repository tag next-20140807.

Changes in V2:
- Changed the method for selecting the driver table.

 drivers/regulator/Kconfig|9 ++--
 drivers/regulator/da9211-regulator.c |   95 +++---
 drivers/regulator/da9211-regulator.h |7 ++-
 include/linux/regulator/da9211.h |7 ++-
 4 files changed, 91 insertions(+), 27 deletions(-)

diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 2dc8289..1344aa8 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -199,13 +199,14 @@ config REGULATOR_DA9210
  interface.
 
 config REGULATOR_DA9211
-   tristate "Dialog Semiconductor DA9211/DA9212 regulator"
+   tristate "Dialog Semiconductor DA9211/DA9212/DA9213/DA9214 regulator"
depends on I2C
select REGMAP_I2C
help
- Say y here to support for the Dialog Semiconductor DA9211/DA9212.
- The DA9211/DA9212 is a multi-phase synchronous step down
- converter 12A DC-DC Buck controlled through an I2C
+ Say y here to support for the Dialog Semiconductor DA9211/DA9212
+ /DA9213/DA9214.
+ The DA9211/DA9212/DA9213/DA9214 is a multi-phase synchronous
+ step down converter 12A or 16A DC-DC Buck controlled through an I2C
  interface.
 
 config REGULATOR_DBX500_PRCMU
diff --git a/drivers/regulator/da9211-regulator.c b/drivers/regulator/da9211-regulator.c
index 1482ada..ccc2e36 100644
--- a/drivers/regulator/da9211-regulator.c
+++ b/drivers/regulator/da9211-regulator.c
@@ -1,5 +1,5 @@
 /*
- * da9211-regulator.c - Regulator device driver for DA9211
+ * da9211-regulator.c - Regulator device driver for DA9211/DA9213
  * Copyright (C) 2014  Dialog Semiconductor Ltd.
  *
  * This library is free software; you can redistribute it and/or
@@ -27,6 +27,10 @@
 #include 
 #include "da9211-regulator.h"
 
+/* DEVICE IDs */
+#define DA9211_DEVICE_ID   0x22
+#define DA9213_DEVICE_ID   0x23
+
 #define DA9211_BUCK_MODE_SLEEP 1
 #define DA9211_BUCK_MODE_SYNC  2
 #define DA9211_BUCK_MODE_AUTO  3
@@ -42,6 +46,7 @@ struct da9211 {
struct regulator_dev *rdev[DA9211_MAX_REGULATORS];
int num_regulator;
int chip_irq;
+   int chip_id;
 };
 
 static const struct regmap_range_cfg da9211_regmap_range[] = {
@@ -52,14 +57,14 @@ static const struct regmap_range_cfg da9211_regmap_range[] = {
.window_start = 0,
.window_len = 256,
.range_min = 0,
-   .range_max = 2*256,
+   .range_max = 5*128,
},
 };
 
 static const struct regmap_config da9211_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
-   .max_register = 2 * 256,
+   .max_register = 5 * 128,
.ranges = da9211_regmap_range,
.num_ranges = ARRAY_SIZE(da9211_regmap_range),
 };
@@ -69,11 +74,20 @@ static const struct regmap_config da9211_regmap_config = {
 #define DA9211_MAX_MV  1570
 #define DA9211_STEP_MV 10
 
-/* Current limits for buck (uA) indices corresponds with register values */
+/* Current limits for DA9211 buck (uA) indices
+ * corresponds with register values
+ */
 static const int da9211_current_limits[] = {
200, 220, 240, 260, 280, 300, 320, 340,
360, 380, 400, 420, 440, 460, 480, 500
 };
+/* Current limits for DA9213 buck (uA) indices
+ * corresponds with register values
+ */
+static const int da9213_current_limits[] = {
+   300, 320, 340, 360, 380, 400, 420, 440,
+   460, 480, 500, 520, 540, 560, 580, 600
+};
 
 static unsigned int da9211_buck_get_mode(struct regulator_dev *rdev)
 {
@@ -129,12 +143,26 @@ static int da9211_set_current_limit(struct regulator_dev *rdev, int min,
 {
int id = rdev_get_id(rdev);
struct da9211 *chip = rdev_get_drvdata(rdev);
-   int i;
+   int i, max_size;
+   const int *current_limits;
+
+   switch (chip->chip_id) {
+   case DA9211:
+   current_limits = da9211_current_limits;
+   max_size = ARRAY_SIZE(da9211_current_limits)-1;
+   break;
+   case DA9213:
+   current_limits = da9213_current_limits;
+   max_size = ARRAY_SIZE(da9213_current_limits)-1;
+   break;
+   default:
+   return -EINVAL;
+   }
 
/* search for closest to maximum */
-   for (i = ARRAY_SIZE(da9211_current_limits)-1; i >= 0; i--) {
-   if (min <= da9211_current_limits[i] &&
-   max >= da9211_current_limits[i]) {
+   for (i = max_size; i >= 0; i--) {
+   if (min <= current_limits[i] &&
+   max >= current_limits[i]) {
   

Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup

2014-08-07 Thread Davidlohr Bueso
On Thu, 2014-08-07 at 17:45 -0700, Davidlohr Bueso wrote:
> On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> > On a highly contended rwsem, spinlock contention due to the slow
> > rwsem_wake() call can be a significant portion of the total CPU cycles
> > used. With writer lock stealing and writer optimistic spinning, there
> > is also a pretty good chance that the lock may have been stolen
> > before the waker wakes up the waiters. The woken tasks, if any,
> > will have to go back to sleep again.
> 
> Good catch! And this applies to mutexes as well. How about something
> like this:
> 
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index dadbf88..e037588 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>  
>  #endif
>  
> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)

If DEBUG, we don't clear the owner when unlocking. This can just be 

+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER

> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> + struct task_struct *owner = ACCESS_ONCE(lock->owner);
> +
> + return owner != NULL;
> +}
> +#else
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> + return false;
> +}
> +#endif
> +
>  /*
>   * Release the lock, slowpath:
>   */
> @@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>   mutex_release(&lock->dep_map, nested, _RET_IP_);
>   debug_mutex_unlock(lock);
>  
> + /*
> +  * Abort the wakeup operation if there is an active writer as the
> +  * lock was stolen. mutex_unlock() should have cleared the owner field
> +  * before calling this function. If that field is now set, there must
> +  * be an active writer present.
> +  */
> + if (mutex_has_owner(lock))
> + goto done;

Err, so we actually deadlock here because we do the check with the
lock->wait_lock held while, at the same time, another task comes into the
slowpath of a mutex_lock() call and also tries to take the wait_lock,
ending up with hung tasks. Here's a more tested patch against
peterz-queue; it survives aim7 and kernel builds on an 80-core box. Thanks.
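
As a rough userspace sketch of the idea (not the kernel code): the owner check
happens before any wait_lock would be taken, and the wakeup is skipped if the
lock has been stolen. struct toy_mutex and both helpers are hypothetical names
for illustration; the wakeups counter stands in for wake_up_process().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for struct mutex. */
struct toy_mutex {
	_Atomic(void *) owner;	/* NULL when nobody holds the lock */
	int wakeups;		/* wakeups issued by the unlock slowpath */
};

/* Like mutex_has_owner() in the patch: one lockless read of the owner. */
static bool toy_mutex_has_owner(struct toy_mutex *lock)
{
	return atomic_load_explicit(&lock->owner, memory_order_relaxed) != NULL;
}

/*
 * Unlock slowpath sketch. The caller has already cleared the owner field.
 * If it is set again, another task stole the lock, so no waiter is woken.
 * Returns true if a wakeup was issued.
 */
static bool toy_unlock_slowpath(struct toy_mutex *lock)
{
	if (toy_mutex_has_owner(lock))
		return false;	/* bail out before touching any wait_lock */

	lock->wakeups++;	/* stands in for wake_up_process(waiter->task) */
	return true;
}
```

The check is only a hint: the owner can change right after it is read, so at
worst a wakeup is still wasted, which is the pre-patch behavior.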


8<---
From: Davidlohr Bueso 
Subject: [PATCH] locking/mutex: Do not falsely wake-up tasks

The mutex lock-stealing functionality allows another task to
skip its turn in the wait-queue and atomically acquire the lock.
This is fine and a nice optimization. However, when releasing
the mutex, we always wake up the next task in FIFO order. When
the lock has been stolen, this wastes a wakeup: the task is woken
just to immediately realize it cannot acquire the lock and go
back to sleep. This is especially true on highly contended
mutexes that stress the wait_lock.

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/mutex.c | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dadbf88..52e1136 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -383,12 +383,26 @@ done:
 
return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   struct task_struct *owner = ACCESS_ONCE(lock->owner);
+
+   return owner != NULL;
+}
+
 #else
+
 static bool mutex_optimistic_spin(struct mutex *lock,
  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   return false;
+}
 #endif
 
 __visible __used noinline
@@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
if (__mutex_slowpath_needs_to_unlock())
atomic_set(&lock->count, 1);
 
+/*
+ * Skipping the mutex_has_owner() check when DEBUG, allows us to
+ * avoid taking the wait_lock in order to do not call mutex_release()
+ * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
+ * deadlocks when another task enters the lock's slowpath in mutex_lock().
+ */
+#ifndef CONFIG_DEBUG_MUTEXES
+   /*
+* Abort the wakeup operation if there is an another mutex owner, as the
+* lock was stolen. mutex_unlock() should have cleared the owner field
+* before calling this function. If that field is now set, another task
+* must have acquired the mutex.
+*/
+   if (mutex_has_owner(lock))
+   return;
+#endif
+
spin_lock_mutex(&lock->wait_lock, flags);
mutex_release(&lock->dep_map, nested, _RET_IP_);
debug_mutex_unlock(lock);
@@ -744,7 +775,6 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 
wake_up_process(waiter->task);
}
-
spin_unlock_mutex(&lock->wait_lock, flags);
 }
 
-- 
1.8.1.4




[PATCH 3/4] xen-pciback: use pci device flag operation helper function

2014-08-07 Thread Ethan Zhao
Use the PCI device flag helper functions when setting a device
to the assigned or deassigned state.

Acked-by: David Vrabel 
Reviewed-by: Konrad Rzeszutek Wilk 
Signed-off-by: Ethan Zhao 
---
 drivers/xen/xen-pciback/pci_stub.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index d57a173..e593921 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -133,7 +133,7 @@ static void pcistub_device_release(struct kref *kref)
xen_pcibk_config_free_dyn_fields(dev);
xen_pcibk_config_free_dev(dev);
 
-   dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+   pci_clear_dev_assigned(dev);
pci_dev_put(dev);
 
kfree(psdev);
@@ -413,7 +413,7 @@ static int pcistub_init_device(struct pci_dev *dev)
dev_dbg(&dev->dev, "reset device\n");
xen_pcibk_reset_device(dev);
 
-   dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
+   pci_set_dev_assigned(dev);
return 0;
 
 config_release:
-- 
1.7.1



[PATCH v3 0/4 resend] Introduce device assignment flag operation helper function

2014-08-07 Thread Ethan Zhao
This patch set introduces three PCI device flag helper functions for setting
a PCI device (PF/VF) to the assigned or deassigned state and for checking
that state. Patches 2, 3 and 4 apply these helper functions to KVM, Xen and PCI.

v2: simplify unnecessary ternary operation in function pci_is_dev_assigned().
v3: amend helper function naming.

Appreciate the suggestions from
alex.william...@redhat.com,
david.vra...@citrix.com,
alexander.h.du...@intel.com

Resend for v3.16 building.

Thanks,
Ethan
---
Ethan Zhao (4):
  PCI: introduce helper functions for device flag operation
  KVM: use pci device flag operation helper functions
  xen-pciback: use pci device flag operation helper function
  PCI: use device flag operation helper function in iov.c

 drivers/pci/iov.c  |2 +-
 drivers/xen/xen-pciback/pci_stub.c |4 ++--
 include/linux/pci.h|   13 +
 virt/kvm/assigned-dev.c|2 +-
 virt/kvm/iommu.c   |4 ++--
 5 files changed, 19 insertions(+), 6 deletions(-)



[PATCH 1/4] PCI: introduce helper functions for device flag operation

2014-08-07 Thread Ethan Zhao
This patch introduces three helper functions to hide direct
device flag manipulation.

void pci_set_dev_assigned(struct pci_dev *pdev);
void pci_clear_dev_assigned(struct pci_dev *pdev);
bool pci_is_dev_assigned(struct pci_dev *pdev);
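
A minimal userspace sketch of how the three helpers behave. The struct pci_dev
below is a mock holding only the one field the helpers touch, and the flag
value is an assumption for illustration; the real definitions live in
include/linux/pci.h.

```c
#include <stdbool.h>

/* Assumption: illustrative bit value, see include/linux/pci.h for the real one. */
#define PCI_DEV_FLAGS_ASSIGNED (1u << 2)

/* Mock of struct pci_dev with only the field used here. */
struct pci_dev {
	unsigned int dev_flags;
};

static inline void pci_set_dev_assigned(struct pci_dev *pdev)
{
	pdev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
}

static inline void pci_clear_dev_assigned(struct pci_dev *pdev)
{
	pdev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
}

static inline bool pci_is_dev_assigned(struct pci_dev *pdev)
{
	/* Conversion to bool yields true for any non-zero masked value. */
	return pdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED;
}
```

Setting and clearing leave the other flag bits untouched, so callers no longer
need to open-code the mask operations.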

Signed-off-by: Ethan Zhao 
---
 v2: simplify unnecessary ternary operation in function pci_is_dev_assigned();
 v3: amend helper functions naming.
---
 include/linux/pci.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 466bcd1..b610ab3 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1829,4 +1829,17 @@ int pci_for_each_dma_alias(struct pci_dev *pdev,
  */
 struct pci_dev *pci_find_upstream_pcie_bridge(struct pci_dev *pdev);
 
+/* helper functions for operation of device flag */
+static inline void pci_set_dev_assigned(struct pci_dev *pdev)
+{
+   pdev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
+}
+static inline void pci_clear_dev_assigned(struct pci_dev *pdev)
+{
+   pdev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+}
+static inline bool pci_is_dev_assigned(struct pci_dev *pdev)
+{
+   return pdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED;
+}
 #endif /* LINUX_PCI_H */
-- 
1.7.1



[PATCH 4/4] PCI: use device flag operation helper function in iov.c

2014-08-07 Thread Ethan Zhao
Use the device flag helper function when checking device
assignment status.

Signed-off-by: Ethan Zhao 
---
 drivers/pci/iov.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index cb6f247..4d109c0 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -633,7 +633,7 @@ int pci_vfs_assigned(struct pci_dev *dev)
 * our dev as the physical function and the assigned bit is set
 */
if (vfdev->is_virtfn && (vfdev->physfn == dev) &&
-   (vfdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED))
+   pci_is_dev_assigned(vfdev))
vfs_assigned++;
 
vfdev = pci_get_device(dev->vendor, dev_id, vfdev);
-- 
1.7.1



[PATCH 2/4] KVM: use pci device flag operation helper functions

2014-08-07 Thread Ethan Zhao
Use helper functions instead of operating directly on the PCI device
flags when setting a device to assigned or deassigned.

Acked-by: Paolo Bonzini 
Signed-off-by: Ethan Zhao 
---
 virt/kvm/assigned-dev.c |2 +-
 virt/kvm/iommu.c|4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index bf06577..38581ee 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -302,7 +302,7 @@ static void kvm_free_assigned_device(struct kvm *kvm,
else
pci_restore_state(assigned_dev->dev);
 
-   assigned_dev->dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+   pci_clear_dev_assigned(assigned_dev->dev);
 
pci_release_regions(assigned_dev->dev);
pci_disable_device(assigned_dev->dev);
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 0df7d4b..34a8b02 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -194,7 +194,7 @@ int kvm_assign_device(struct kvm *kvm,
goto out_unmap;
}
 
-   pdev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
+   pci_set_dev_assigned(pdev);
 
dev_info(&pdev->dev, "kvm assign device\n");
 
@@ -220,7 +220,7 @@ int kvm_deassign_device(struct kvm *kvm,
 
iommu_detach_device(domain, &pdev->dev);
 
-   pdev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+   pci_clear_dev_assigned(pdev);
 
dev_info(&pdev->dev, "kvm deassign device\n");
 
-- 
1.7.1



Re: [PATCH net-next 0/2] xen-netback: Changes around carrier handling

2014-08-07 Thread David Miller
From: Zoltan Kiss 
Date: Thu, 7 Aug 2014 16:51:17 +0100

> Ok, how about this:
> * ndo_start_xmit tries to set up the grant copy operations, something
>   which is done now in the thread
> * no estimation madness, just go ahead and try to do it
> * if the skb can fit, kick the thread
> * if it fails (not enough slots to complete the TX), then:
>   * call netif_tx_stop_queue on that queue (just like now)
>   * set up timer rx_stalled (just like now)
>   * save the state of the current skb (where the grant copy op setup is
>     halted)
> * if new slots coming in, continue to create the grant copy ops for the
>   stalled skb, and if it succeeds, kick the thread plus call
>   netif_tx_start_queue. (just like now)
> * if the timer fires, drop the stalled skb, and set the carrier off, so
>   QDisc won't bother to queue packets for a stalled interface
> * the thread will only do the actual grant copy hypercall and releasing
>   the skb
> * in any case, ndo_start_xmit should return NETDEV_TX_OK, just like now

It sounds like this would work, and indeed it would abide by the intended
rules of netif_{stop,wake}_queue() and ->ndo_start_xmit()'s return
values.


Re: [PATCH] staging/lustre: use rcu_dereference to access rcu protected current->real_parent field

2014-08-07 Thread Greg Kroah-Hartman
On Fri, Aug 08, 2014 at 01:06:15AM -0400, Oleg Drokin wrote:
> 
> On Aug 8, 2014, at 12:42 AM, Greg Kroah-Hartman wrote:
> 
> > On Fri, Aug 08, 2014 at 12:03:20AM -0400, Oleg Drokin wrote:
> >> Hello!
> >> 
> >> On Aug 7, 2014, at 11:49 PM, Greg Kroah-Hartman wrote:
>  
>  This is not a critical bug and in the worst case the code here may
>  cause miss of statistics counter increase.
>  This is why I think it is not worth to backport the patch at all.
> >>> You are right, and if this is just for some random "statistics" file,
> >>> can we just delete the whole function?
> >> 
> >> I hope not!
> >> This is used all around the client to tally up various operations executed counts.
> > Why would you do that?  Why would they care?
> 
> We would do that to provide information on the client operations performed.
> They would care because they are interested in what particular clients might be doing.
> 
> >> The statistic is then used by various userspace monitoring tools.
> > Why not use the in-kernel monitoring tools instead of creating your own?
> > What does userspace do with that information?
> 
> We don't really control the userspace tools. People write tools to suit their needs
> to monitor loads, see odd things the end users are doing or possibly for some
> debugging even.
> Correlating these numbers with what server sees also proves useful at times
> (write combining for example).
> 
> Here's a sample of output of a recently mounted client that I poked on a bit (the lines starting with # are my comments):
> # cat /proc/fs/lustre/llite/lustre-88008dde27f0/stats
> snapshot_time 1407473168.466102 secs.usecs
> read_bytes    1 samples [bytes] 0 0 0
> write_bytes   4 samples [bytes] 2 7 19
> osc_write 4 samples [bytes] 2 7 19
> # The bytes counts show you minimum, maximum of writes seen and total number of bytes read-written.
> # Lustre (and many other network filesystems) is very sensitive to small IO, esp. reads so it's good
> # to know if you have a lot of it.
> open  6 samples [regs]
> # The "regs" type just shows you how many of given type operations were performed since last statistic reset.
> # Frequently that allows people to guess where does high load come from on a particular client when
> # it's otherwise not obvious because not a lot of cpu is used.
> # Some operations are heavier than others too.
> close 6 samples [regs]
> readdir   4 samples [regs]
> setattr   1 samples [regs]
> truncate  4 samples [regs]
> getattr   7 samples [regs]
> create        1 samples [regs]
> alloc_inode   1 samples [regs]
> getxattr  8 samples [regs]
> inode_permission  28 samples [regs]
> 
> As more operations types are seen the list grows.
> Then there are also specific stats for readahead (data and metadata) so that interested people can make informed
> decisions on the tuning there should they be unsatisfied with default settings.
> 
> I am not sure there's a similar mechanism in the kernel already that
> would allow us to get this sort of data easily all in one place?

perf should show you this, if not, please add the functionality there.
A filesystem is not the place to have performance monitoring code, this
needs to be removed before it can be moved out of staging.  Please work
with the trace/perf developers on this if there is something lacking
there.

thanks,

greg k-h
dG

> 
> Bye,
> Oleg


Re: [PATCH net-next 0/2] xen-netback: Changes around carrier handling

2014-08-07 Thread David Miller
From: Zoltan Kiss 
Date: Thu, 7 Aug 2014 17:49:37 +0100

> David Vrabel pointed out an important question in a reply to the
> previous version of this series: this patch deschedules NAPI if the
> carrier goes down. The backend doesn't receive packets from the
> guest. DavidVr and others said we shouldn't do this, the guest should
> be able to transmit even if it's not able/willing to receive. Other
> drivers don't deschedule NAPI at carrier off either, however the
> "carrier off" information comes from the hardware, not from an
> untrusted guest who is not posting buffers on the receive ring.
> I don't have any good argument why I did it the current way, other
> than a hunch that it feels more natural.
> David, do you have an opinion on that?

Unless you have a strong reason for doing so, I don't think disabling
receives when the TX path backs up is necessary.


Re: [PATCH 07/10 v3] coresight-etm: add CoreSight ETM/PTM driver

2014-08-07 Thread Dirk Behme

On 07.08.2014 20:21, mathieu.poir...@linaro.org wrote:

From: Pratik Patel 

This driver manages CoreSight ETM (Embedded Trace Macrocell) that
supports processor tracing. Currently supported version are ARM
ETMv3.x and PTM1.x.

Signed-off-by: Pratik Patel 
Signed-off-by: Panchaxari Prasannamurthy 
Signed-off-by: Mathieu Poirier 
---

...

+static struct amba_id etm_ids[] = {
+   {   /* ETM 3.3 */
+   .id = 0x0003b921,
+   .mask   = 0x0003,
+   },
+   {   /* ETM 3.5 */
+   .id = 0x0003b956,
+   .mask   = 0x0003,
+   },
+   {   /* PTM */
+   .id = 0x0003b95f,
+   .mask   = 0x0003,
+   },
+   { 0, 0},


Maybe you would like to add PTM 1.0 [1] here, too?

Best regards

Dirk

[1]

diff --git a/drivers/coresight/coresight-etm.c b/drivers/coresight/coresight-etm.c
index 6f5dbc7..a7a08e6 100644
--- a/drivers/coresight/coresight-etm.c
+++ b/drivers/coresight/coresight-etm.c
@@ -1284,6 +1284,8 @@ static bool etm_arch_supported(u8 arch)
break;
case ETM_ARCH_V3_5:
break;
+   case PFT_ARCH_V1_0:
+   break;
case PFT_ARCH_V1_1:
break;
default:
@@ -1418,6 +1420,7 @@ static int etm_probe(struct amba_device *adev, const struct amba_id *id)

put_online_cpus();

if (etm_arch_supported(drvdata->arch) == false) {
+   dev_err(dev, "ETM arch 0x%02x not supported\n", drvdata->arch);
ret = -EINVAL;
goto err_arch_supported;
}
@@ -1472,11 +1475,15 @@ static struct amba_id etm_ids[] = {
.id = 0x0003b921,
.mask   = 0x0003,
},
+   {   /* PTM 1.0 */
+   .id = 0x0003b950,
+   .mask   = 0x0003,
+   },
{   /* ETM 3.5 */
.id = 0x0003b956,
.mask   = 0x0003,
},
-   {   /* PTM */
+   {   /* PTM 1.1 */
.id = 0x0003b95f,
.mask   = 0x0003,
},
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index a19420e..596ec94 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -32,6 +32,7 @@

 #define ETM_ARCH_V3_3  (0x23)
 #define ETM_ARCH_V3_5  (0x25)
+#define PFT_ARCH_V1_0  (0x30)
 #define PFT_ARCH_V1_1  (0x31)

 #define CORESIGHT_UNLOCK   (0xC5ACCE55)



[PATCH V3 2/2] Revert "ASoC: fsl-esai: Add .xlate_tdm_slot_mask() support."

2014-08-07 Thread Shengjiu Wang
This reverts commit a603c8ee526f5ea9ad9b40710308766299ad8a69.

fsl_asoc_xlate_tdm_slot_mask() inverts the mask, so for ESAI every enabled
slot bit becomes disabled and the ESAI works abnormally.
When .xlate_tdm_slot_mask is not defined, the default function
snd_soc_xlate_tdm_slot_mask() is used instead, which works for ESAI.
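
A minimal userspace sketch of the problem described above. Both helpers are
hypothetical models written for illustration, not the kernel functions:
model_default_mask() assumes the default translation sets one bit per slot,
and model_inverted_mask() assumes the other variant returns the bitwise
complement of that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one enabled bit per slot (assumed default behaviour). */
static uint32_t model_default_mask(unsigned int slots)
{
	assert(slots > 0 && slots <= 32);
	return slots == 32 ? UINT32_MAX : (UINT32_C(1) << slots) - 1;
}

/* Hypothetical model: the inverted mask the commit message complains about. */
static uint32_t model_inverted_mask(unsigned int slots)
{
	return ~model_default_mask(slots);
}
```

With two slots the first model gives 0x3 and the second 0xfffffffc, so a
consumer that treats a set bit as "slot enabled" would see exactly the wrong
slots active.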

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/Kconfig|1 -
 sound/soc/fsl/fsl_esai.c |2 --
 2 files changed, 3 deletions(-)

diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
index f54a8fc..f3012b6 100644
--- a/sound/soc/fsl/Kconfig
+++ b/sound/soc/fsl/Kconfig
@@ -49,7 +49,6 @@ config SND_SOC_FSL_ESAI
tristate "Enhanced Serial Audio Interface (ESAI) module support"
select REGMAP_MMIO
select SND_SOC_IMX_PCM_DMA if SND_IMX_SOC != n
-   select SND_SOC_FSL_UTILS
help
  Say Y if you want to add Enhanced Synchronous Audio Interface
  (ESAI) support for the Freescale CPUs.
diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index f252370..b2f6b3e 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -18,7 +18,6 @@
 
 #include "fsl_esai.h"
 #include "imx-pcm.h"
-#include "fsl_utils.h"
 
 #define FSL_ESAI_RATES SNDRV_PCM_RATE_8000_192000
 #define FSL_ESAI_FORMATS   (SNDRV_PCM_FMTBIT_S8 | \
@@ -612,7 +611,6 @@ static struct snd_soc_dai_ops fsl_esai_dai_ops = {
.hw_params = fsl_esai_hw_params,
.set_sysclk = fsl_esai_set_dai_sysclk,
.set_fmt = fsl_esai_set_dai_fmt,
-   .xlate_tdm_slot_mask = fsl_asoc_xlate_tdm_slot_mask,
.set_tdm_slot = fsl_esai_set_dai_tdm_slot,
 };
 
-- 
1.7.9.5



[PATCH V3 0/2] refine esai for tdm support

2014-08-07 Thread Shengjiu Wang
These patches refine esai for TDM support.

Changes for V3
- update the comments for patch 2

Changes for V2
- update the comments according to the reviewer's suggestion
- add init value for slots and change pin to pins.


Shengjiu Wang (2):
  ASoC: fsl_esai: refine esai for TDM support
  Revert "ASoC: fsl-esai: Add .xlate_tdm_slot_mask() support."

 sound/soc/fsl/Kconfig|1 -
 sound/soc/fsl/fsl_esai.c |   16 +++-
 sound/soc/fsl/fsl_esai.h |8 
 3 files changed, 15 insertions(+), 10 deletions(-)

-- 
1.7.9.5



Re: [PATCH] staging/lustre: use rcu_dereference to access rcu protected current->real_parent field

2014-08-07 Thread Oleg Drokin

On Aug 8, 2014, at 12:42 AM, Greg Kroah-Hartman wrote:

> On Fri, Aug 08, 2014 at 12:03:20AM -0400, Oleg Drokin wrote:
>> Hello!
>> 
>> On Aug 7, 2014, at 11:49 PM, Greg Kroah-Hartman wrote:
 
>>>> This is not a critical bug and in the worst case the code here may
>>>> cause miss of statistics counter increase.
>>>> This is why I think it is not worth to backport the patch at all.
>>> You are right, and if this is just for some random "statistics" file,
>>> can we just delete the whole function?
>> 
>> I hope not!
>> This is used all around the client to tally up counts of the various
>> operations executed.
> Why would you do that?  Why would they care?

We would do that to provide information on the client operations performed.
They would care because they are interested in what particular clients
might be doing.

>> The statistic is then used by various userspace monitoring tools.
> Why not use the in-kernel monitoring tools instead of creating your own?
> What does userspace do with that information?

We don't really control the userspace tools. People write tools to suit their
needs: to monitor loads, to see odd things the end users are doing, or possibly
even for some debugging.
Correlating these numbers with what the server sees also proves useful at times
(write combining, for example).

Here's a sample of the output from a recently mounted client that I poked at a
bit (the lines starting with # are my comments):
# cat /proc/fs/lustre/llite/lustre-88008dde27f0/stats
snapshot_time 1407473168.466102 secs.usecs
read_bytes 1 samples [bytes] 0 0 0
write_bytes   4 samples [bytes] 2 7 19
osc_write 4 samples [bytes] 2 7 19
# The byte counts show the minimum and maximum sizes seen and the total number
# of bytes read/written.
# Lustre (and many other network filesystems) is very sensitive to small IO,
# esp. reads, so it's good to know if you have a lot of it.
open  6 samples [regs]
# The "regs" type just shows how many operations of the given type were
# performed since the last statistics reset.
# Frequently that lets people guess where high load comes from on a
# particular client when it's otherwise not obvious because not a lot of CPU
# is used.
# Some operations are heavier than others, too.
close 6 samples [regs]
readdir   4 samples [regs]
setattr   1 samples [regs]
truncate  4 samples [regs]
getattr   7 samples [regs]
create 1 samples [regs]
alloc_inode   1 samples [regs]
getxattr  8 samples [regs]
inode_permission  28 samples [regs]

As more operation types are seen, the list grows.
Then there are also specific stats for readahead (data and metadata) so that
interested people can make informed decisions on the tuning there, should they
be unsatisfied with the default settings.

I am not sure there's a similar mechanism in the kernel already that would
allow us to get this sort of data easily, all in one place?
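
For what it's worth, this is roughly how a monitoring tool might consume one
line of that file. It is only a sketch: the line shape ("name N samples
[unit]" optionally followed by "min max sum") is inferred from the sample
above, and struct stat_line and parse_stat_line() are made-up names, not part
of any Lustre tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of the stats file; field names are illustrative only. */
struct stat_line {
	char name[32];
	unsigned long long samples;
	char unit[16];
	int has_range;			/* 1 when min/max/sum were present */
	unsigned long long min, max, sum;
};

/*
 * Parse "name N samples [unit]" optionally followed by "min max sum".
 * Returns 0 on success, -1 if the line does not match that shape
 * (the snapshot_time header line, for example).
 */
static int parse_stat_line(const char *line, struct stat_line *out)
{
	int n;

	memset(out, 0, sizeof(*out));
	n = sscanf(line, "%31s %llu samples [%15[^]]] %llu %llu %llu",
		   out->name, &out->samples, out->unit,
		   &out->min, &out->max, &out->sum);
	if (n != 3 && n != 6)
		return -1;
	out->has_range = (n == 6);
	return 0;
}
```

A tool would call this once per line and skip anything that returns -1.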

Bye,
Oleg



[PATCH v5 3/10] ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate

2014-08-07 Thread Namjae Jeon
This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4.

1) Make sure that both offset and len are block size aligned.
2) Update the i_size of inode by len bytes.
3) Compute the file's logical block number against offset. If the computed
   block number is not the starting block of the extent, split the extent
   such that the block number is the starting block of the extent.
4) Shift all the extents which are lying between [offset, last allocated extent]
   towards right by len bytes. This step will make a hole of len bytes
   at offset.
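
Steps 1 and 4 above can be sketched in userspace as follows. This is a toy
model under stated assumptions, not the ext4 code: struct toy_extent,
range_is_aligned() and shift_extents_right() are hypothetical names, extents
are kept in a sorted array counted in blocks, and step 3 is assumed to have
already split any extent that straddles the start block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy extent: logical start block and length in blocks (illustrative only). */
struct toy_extent {
	uint64_t lblk;
	uint64_t len;
};

/* Step 1: both offset and len must be multiples of the block size. */
static int range_is_aligned(uint64_t offset, uint64_t len, uint64_t blksize)
{
	assert(blksize != 0 && (blksize & (blksize - 1)) == 0);
	return ((offset | len) & (blksize - 1)) == 0;
}

/*
 * Step 4: move every extent starting at or after start_lblk to the right by
 * shift blocks, walking from the last extent back towards the first so that
 * no extent is moved on top of one that has not been moved yet.
 */
static void shift_extents_right(struct toy_extent *ext, size_t n,
				uint64_t start_lblk, uint64_t shift)
{
	while (n > 0 && ext[n - 1].lblk >= start_lblk) {
		ext[n - 1].lblk += shift;
		n--;
	}
}
```

After the shift, the blocks [start_lblk, start_lblk + shift) are no longer
covered by any extent, which is the hole the description refers to.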

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
Changelog

v5:
 - remove allocation part.

 fs/ext4/ext4.h  |   1 +
 fs/ext4/extents.c   | 322 ++--
 include/trace/events/ext4.h |  25 
 3 files changed, 339 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 5b19760..03bd3ec 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2730,6 +2730,7 @@ extern int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
__u64 start, __u64 len);
 extern int ext4_ext_precache(struct inode *inode);
 extern int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len);
+extern int ext4_insert_range(struct file *file, loff_t offset, loff_t len);
 
 /* move_extent.c */
 extern void ext4_double_down_write_data_sem(struct inode *first,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 76c2df3..a1e0635 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4891,7 +4891,8 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 
/* Return error if mode is not supported */
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE))
+FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
+FALLOC_FL_INSERT_RANGE))
return -EOPNOTSUPP;
 
if (mode & FALLOC_FL_PUNCH_HOLE)
@@ -4914,6 +4915,9 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
if (mode & FALLOC_FL_ZERO_RANGE)
return ext4_zero_range(file, offset, len, mode);
 
+   if (mode & FALLOC_FL_INSERT_RANGE)
+   return ext4_insert_range(file, offset, len);
+
trace_ext4_fallocate_enter(inode, offset, len, mode);
lblk = offset >> blkbits;
/*
@@ -5216,13 +5220,13 @@ ext4_access_path(handle_t *handle, struct inode *inode,
 }
 
 /*
- * ext4_ext_shift_path_extents:
+ * ext4_ext_shift_path_extents_left:
  * Shift the extents of a path structure lying between path[depth].p_ext
- * and EXT_LAST_EXTENT(path[depth].p_hdr) downwards, by subtracting shift
+ * and EXT_LAST_EXTENT(path[depth].p_hdr) to the left, by subtracting shift
  * from starting block for each extent.
  */
 static int
-ext4_ext_shift_path_extents(struct ext4_ext_path *path, ext4_lblk_t shift,
+ext4_ext_shift_path_extents_left(struct ext4_ext_path *path, ext4_lblk_t shift,
struct inode *inode, handle_t *handle,
ext4_lblk_t *start)
 {
@@ -5292,13 +5296,13 @@ out:
 }
 
 /*
- * ext4_ext_shift_extents:
+ * ext4_ext_shift_extents_left:
  * All the extents which lies in the range from start to the last allocated
- * block for the file are shifted downwards by shift blocks.
+ * block for the file are shifted to the left by shift blocks.
  * On success, 0 is returned, error otherwise.
  */
 static int
-ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
+ext4_ext_shift_extents_left(struct inode *inode, handle_t *handle,
   ext4_lblk_t start, ext4_lblk_t shift)
 {
struct ext4_ext_path *path;
@@ -5378,7 +5382,7 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
break;
}
}
-   ret = ext4_ext_shift_path_extents(path, shift, inode,
+   ret = ext4_ext_shift_path_extents_left(path, shift, inode,
handle, );
ext4_ext_drop_refs(path);
kfree(path);
@@ -5483,7 +5487,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
}
ext4_discard_preallocations(inode);
 
-   ret = ext4_ext_shift_extents(inode, handle, punch_stop,
+   ret = ext4_ext_shift_extents_left(inode, handle, punch_stop,
 punch_stop - punch_start);
if (ret) {
up_write(_I(inode)->i_data_sem);
@@ -5508,3 +5512,303 @@ out_mutex:
mutex_unlock(>i_mutex);
return ret;
 }
+
+/*
+ * ext4_ext_shift_path_extents_right:
+ * Shift the extents of a path structure towards right, by adding shift_lblk
+ * to the starting ee_block of each extent. Shifting is done from
+ * the last extent in the path till we reach first extent OR hit start_lblk.
+ * 

[PATCH v5 5/10] xfstests: generic/029: Standard insert range tests

2014-08-07 Thread Namjae Jeon
This test case (029) exercises various corner cases of the finsert range
functionality over different types of extents.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 common/punch  |  5 
 common/rc |  2 +-
 tests/generic/029 | 65 ++
 tests/generic/029.out | 78 +++
 tests/generic/group   |  1 +
 5 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 tests/generic/029
 create mode 100644 tests/generic/029.out

diff --git a/common/punch b/common/punch
index f2d538c..4cc50d5 100644
--- a/common/punch
+++ b/common/punch
@@ -527,6 +527,11 @@ _test_generic_punch()
return
fi
 
+   # If zero_cmd is finsert, don't check unaligned offsets
+   if [ "$zero_cmd" == "finsert" ]; then
+   return
+   fi
+
echo "  16. data -> cache cold ->hole"
if [ "$remove_testfile" ]; then
rm -f $testfile
diff --git a/common/rc b/common/rc
index 2c83340..6d805cf 100644
--- a/common/rc
+++ b/common/rc
@@ -1272,7 +1272,7 @@ _require_xfs_io_command()
"falloc" )
testio=`$XFS_IO_PROG -F -f -c "falloc 0 1m" $testfile 2>&1`
;;
-   "fpunch" | "fcollapse" | "zero" | "fzero" )
+   "fpunch" | "fcollapse" | "zero" | "fzero" | "finsert" )
testio=`$XFS_IO_PROG -F -f -c "pwrite 0 20k" -c "fsync" \
-c "$command 4k 8k" $testfile 2>&1`
;;
diff --git a/tests/generic/029 b/tests/generic/029
new file mode 100644
index 000..2b18069
--- /dev/null
+++ b/tests/generic/029
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/029
+#
+# Standard insert range tests
+# This testcase is one of the 4 testcases which tries to
+# test various corner cases for finsert range functionality over different
+# type of extents. These tests are based on generic/255 test case.
+# For the type of tests, check the description of _test_generic_punch
+# in common/rc.
+#---
+# Copyright (c) 2013 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/029.out b/tests/generic/029.out
new file mode 100644
index 000..572fa45
--- /dev/null
+++ b/tests/generic/029.out
@@ -0,0 +1,78 @@
+QA output created by 029
+   1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+   2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+   3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+cf845a781c107ec1346e849c9dd1b7e8
+   4. hole -> data
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+   5. hole -> unwritten
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+   6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+be0f35d4292a20040766d87883b0abd1
+   7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+be0f35d4292a20040766d87883b0abd1
+   8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+   9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+   10. hole -> data -> 

[PATCH v5 9/10] xfstests: fsstress: Add fallocate insert range

2014-08-07 Thread Namjae Jeon
This commit adds insert operation support for fsstress, which is
meant to exercise fallocate FALLOC_FL_INSERT_RANGE support.
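
As a side note on the alignment handling in the diff below: off and len are
rounded up to a multiple of st_blksize with a mask expression. A minimal
standalone sketch of that rounding, assuming the block size is a power of two
(round_up_pow2() is a hypothetical helper name, not something in fsstress):

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of blksize (blksize must be a power of two). */
static uint64_t round_up_pow2(uint64_t v, uint64_t blksize)
{
	assert(blksize != 0 && (blksize & (blksize - 1)) == 0);
	return (v + blksize - 1) & ~(blksize - 1);
}
```

Values that are already aligned are returned unchanged, so applying it to
both off and len never shrinks the requested range.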

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 ltp/fsstress.c | 19 ---
 src/global.h   |  4 
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index b56fe5c..aa3e0c3 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -72,6 +72,7 @@ typedef enum {
OP_PUNCH,
OP_ZERO,
OP_COLLAPSE,
+   OP_INSERT,
OP_READ,
OP_READLINK,
OP_RENAME,
@@ -170,6 +171,7 @@ void mknod_f(int, long);
 void   punch_f(int, long);
 void   zero_f(int, long);
 void   collapse_f(int, long);
+void   insert_f(int, long);
 void   read_f(int, long);
 void   readlink_f(int, long);
 void   rename_f(int, long);
@@ -209,6 +211,7 @@ opdesc_t ops[] = {
{ OP_PUNCH, "punch", punch_f, 1, 1 },
{ OP_ZERO, "zero", zero_f, 1, 1 },
{ OP_COLLAPSE, "collapse", collapse_f, 1, 1 },
+   { OP_INSERT, "insert", insert_f, 1, 1 },
{ OP_READ, "read", read_f, 1, 0 },
{ OP_READLINK, "readlink", readlink_f, 1, 0 },
{ OP_RENAME, "rename", rename_f, 2, 1 },
@@ -2176,6 +2179,7 @@ struct print_flags falloc_flags [] = {
{ FALLOC_FL_NO_HIDE_STALE, "NO_HIDE_STALE"},
{ FALLOC_FL_COLLAPSE_RANGE, "COLLAPSE_RANGE"},
{ FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"},
+   { FALLOC_FL_INSERT_RANGE, "INSERT_RANGE"},
{ -1, NULL}
 };
 
@@ -2227,10 +2231,11 @@ do_fallocate(int opno, long r, int mode)
off %= maxfsize;
len = (off64_t)(random() % (1024 * 1024));
/*
-* Collapse range requires off and len to be block aligned, make it
-* more likely to be the case.
+* Collapse/insert range requires off and len to be block aligned,
+* make it more likely to be the case.
 */
-   if ((mode & FALLOC_FL_COLLAPSE_RANGE) && (opno % 2)) {
+   if ((mode & (FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_INSERT_RANGE)) &&
+   (opno % 2)) {
off = ((off + stb.st_blksize - 1) & ~(stb.st_blksize - 1));
len = ((len + stb.st_blksize - 1) & ~(stb.st_blksize - 1));
}
@@ -2656,6 +2661,14 @@ collapse_f(int opno, long r)
 }
 
 void
+insert_f(int opno, long r)
+{
+#ifdef HAVE_LINUX_FALLOC_H
+   do_fallocate(opno, r, FALLOC_FL_INSERT_RANGE);
+#endif
+}
+
+void
 read_f(int opno, long r)
 {
char*buf;
diff --git a/src/global.h b/src/global.h
index 8180f66..f63246b 100644
--- a/src/global.h
+++ b/src/global.h
@@ -172,6 +172,10 @@
 #define FALLOC_FL_ZERO_RANGE   0x10
 #endif
 
+#ifndef FALLOC_FL_INSERT_RANGE
+#define FALLOC_FL_INSERT_RANGE 0x20
+#endif
+
 #endif /* HAVE_LINUX_FALLOC_H */
 
 #endif /* GLOBAL_H */
-- 
1.7.11-rc0



[PATCH v5 7/10] xfstests: generic/031: Multi insert range tests

2014-08-07 Thread Namjae Jeon
This test case (031) exercises various corner cases with pre-existing holes
for the finsert range functionality over different types of extents.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 tests/generic/031 | 65 +
 tests/generic/031.out | 80 +++
 tests/generic/group   |  1 +
 3 files changed, 146 insertions(+)
 create mode 100644 tests/generic/031
 create mode 100644 tests/generic/031.out

diff --git a/tests/generic/031 b/tests/generic/031
new file mode 100644
index 000..0e18f87
--- /dev/null
+++ b/tests/generic/031
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/031
+#
+# Multi insert range tests
+# This testcase is one of the 4 testcases which tries to
+# test various corner cases for finsert range functionality over different
+# type of extents. These tests are based on generic/255 test case.
+# For the type of tests, check the description of _test_generic_punch
+# in common/rc.
+#---
+# Copyright (c) 2013 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch -k falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/031.out b/tests/generic/031.out
new file mode 100644
index 000..ae45352
--- /dev/null
+++ b/tests/generic/031.out
@@ -0,0 +1,80 @@
+QA output created by 031
+   1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+   2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+   3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+22b7303d274481990b5401b6263effe0
+   4. hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+c4fef62ba1de9d91a977cfeec6632f19
+   5. hole -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+1ca74f7572a0f4ab477fdbb5682e5f61
+   6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..47]: hole
+4: [48..55]: extent
+be0f35d4292a20040766d87883b0abd1
+   7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+bddb1f3895268acce30d516a99cb0f2f
+   8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..39]: hole
+4: [40..55]: extent
+f8fc47adc45b7cf72f988b3ddf5bff64
+   9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+c4fef62ba1de9d91a977cfeec6632f19
+   10. hole -> data -> hole
+0: [0..7]: extent
+1: [8..39]: hole
+2: [40..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+   11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+   12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+   13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+   14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+   15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index b847c47..544d422 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -33,6 +33,7 @@
 028 auto quick
 029 auto quick prealloc
 030 auto quick prealloc
+031 auto quick 

[PATCH v5 4/10] xfsprogs: xfs_io: add finsert command for insert range via fallocate

2014-08-07 Thread Namjae Jeon
Add finsert command for fallocate FALLOC_FL_INSERT_RANGE flag.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 io/prealloc.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/io/prealloc.c b/io/prealloc.c
index aba6b44..11b1e12 100644
--- a/io/prealloc.c
+++ b/io/prealloc.c
@@ -37,6 +37,10 @@
 #define FALLOC_FL_ZERO_RANGE 0x10
 #endif
 
+#ifndef FALLOC_FL_INSERT_RANGE
+#define FALLOC_FL_INSERT_RANGE 0x20
+#endif
+
 static cmdinfo_t allocsp_cmd;
 static cmdinfo_t freesp_cmd;
 static cmdinfo_t resvsp_cmd;
@@ -46,6 +50,7 @@ static cmdinfo_t zero_cmd;
 static cmdinfo_t falloc_cmd;
 static cmdinfo_t fpunch_cmd;
 static cmdinfo_t fcollapse_cmd;
+static cmdinfo_t finsert_cmd;
 static cmdinfo_t fzero_cmd;
 #endif
 
@@ -169,11 +174,14 @@ fallocate_f(
int mode = 0;
int c;
 
-   while ((c = getopt(argc, argv, "ckp")) != EOF) {
+   while ((c = getopt(argc, argv, "cikp")) != EOF) {
switch (c) {
case 'c':
mode = FALLOC_FL_COLLAPSE_RANGE;
break;
+   case 'i':
+   mode = FALLOC_FL_INSERT_RANGE;
+   break;
case 'k':
mode = FALLOC_FL_KEEP_SIZE;
break;
@@ -237,6 +245,25 @@ fcollapse_f(
 }
 
 static int
+finsert_f(
+   int argc,
+   char**argv)
+{
+   xfs_flock64_t   segment;
+   int mode = FALLOC_FL_INSERT_RANGE;
+
+   if (!offset_length(argv[1], argv[2], ))
+   return 0;
+
+   if (fallocate(file->fd, mode,
+   segment.l_start, segment.l_len)) {
+   perror("fallocate");
+   return 0;
+   }
+   return 0;
+}
+
+static int
 fzero_f(
int argc,
char**argv)
@@ -345,6 +372,16 @@ prealloc_init(void)
_("de-allocates space and eliminates the hole by shifting extents");
add_command(_cmd);
 
+   finsert_cmd.name = "finsert";
+   finsert_cmd.cfunc = finsert_f;
+   finsert_cmd.argmin = 2;
+   finsert_cmd.argmax = 2;
+   finsert_cmd.flags = CMD_NOMAP_OK | CMD_FOREIGN_OK;
+   finsert_cmd.args = _("off len");
+   finsert_cmd.oneline =
+   _("creates new space for writing within file by shifting extents");
+   add_command(_cmd);
+
fzero_cmd.name = "fzero";
fzero_cmd.cfunc = fzero_f;
fzero_cmd.argmin = 2;
-- 
1.7.11-rc0



[PATCH v5 10/10] xfstests: fsx: Add fallocate insert range operation

2014-08-07 Thread Namjae Jeon
This commit adds fallocate FALLOC_FL_INSERT_RANGE support for fsx.
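
For readers unfamiliar with how fsx checks results: it keeps an in-memory
copy of the expected file contents and applies each operation to that copy as
well. The sketch below shows what an insert range does to such a copy. It is
an illustration only; model_insert_range() is a hypothetical name and the
function is not taken from fsx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Model an insert range on an in-memory copy of the file: move the tail
 * [offset, file_size) up by length bytes, zero the gap, return the new size.
 * The caller must ensure buf has room for file_size + length bytes.
 */
static size_t model_insert_range(unsigned char *buf, size_t file_size,
				 size_t offset, size_t length)
{
	assert(offset <= file_size);
	memmove(buf + offset + length, buf + offset, file_size - offset);
	memset(buf + offset, 0, length);
	return file_size + length;
}
```

memmove() is used rather than memcpy() because the source and destination
regions overlap whenever length is smaller than the tail being moved.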

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 ltp/fsx.c | 96 +--
 1 file changed, 94 insertions(+), 2 deletions(-)

diff --git a/ltp/fsx.c b/ltp/fsx.c
index 47d3ee8..9dc1655 100644
--- a/ltp/fsx.c
+++ b/ltp/fsx.c
@@ -95,7 +95,8 @@ int   logcount = 0;   /* total ops */
 #define OP_PUNCH_HOLE  6
 #define OP_ZERO_RANGE  7
 #define OP_COLLAPSE_RANGE  8
-#define OP_MAX_FULL 9
+#define OP_INSERT_RANGE 9
+#define OP_MAX_FULL 10
 
 /* operation modifiers */
 #define OP_CLOSEOPEN   100
@@ -145,6 +146,7 @@ int fallocate_calls = 1; /* -F flag disables */
 int punch_hole_calls = 1;   /* -H flag disables */
 int zero_range_calls = 1;   /* -z flag disables */
 int collapse_range_calls = 1;   /* -C flag disables */
+int insert_range_calls = 1; /* -i flag disables */
 int mapped_reads = 1;   /* -R flag disables it */
 int fsxgoodfd = 0;
 int o_direct;   /* -Z */
@@ -339,6 +341,14 @@ logdump(void)
 lp->args[0] + lp->args[1])
prt("\t**");
break;
+   case OP_INSERT_RANGE:
+   prt("INSERT 0x%x thru 0x%x\t(0x%x bytes)",
+   lp->args[0], lp->args[0] + lp->args[1] - 1,
+   lp->args[1]);
+   if (badoff >= lp->args[0] && badoff <
+lp->args[0] + lp->args[1])
+   prt("\t**");
+   break;
case OP_SKIPPED:
prt("SKIPPED (no operation)");
break;
@@ -1012,6 +1022,59 @@ do_collapse_range(unsigned offset, unsigned length)
 }
 #endif
 
+#ifdef FALLOC_FL_INSERT_RANGE
+void
+do_insert_range(unsigned offset, unsigned length)
+{
+   unsigned end_offset;
+   int mode = FALLOC_FL_INSERT_RANGE;
+
+   if (length == 0) {
+   if (!quiet && testcalls > simulatedopcount)
+   prt("skipping zero length insert range\n");
+   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, length);
+   return;
+   }
+
+   if ((loff_t)offset >= file_size) {
+   if (!quiet && testcalls > simulatedopcount)
+   prt("skipping insert range behind EOF\n");
+   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, length);
+   return;
+   }
+
+   log4(OP_INSERT_RANGE, offset, length, 0);
+
+   if (testcalls <= simulatedopcount)
+   return;
+
+   end_offset = offset + length;
+   if ((progressinterval && testcalls % progressinterval == 0) ||
+   (debug && (monitorstart == -1 || monitorend == -1 ||
+ end_offset <= monitorend))) {
+   prt("%lu insert\tfrom 0x%x to 0x%x, (0x%x bytes)\n", testcalls,
+   offset, offset+length, length);
+   }
+   if (fallocate(fd, mode, (loff_t)offset, (loff_t)length) == -1) {
+   prt("insert range: %x to %x\n", offset, length);
+   prterr("do_insert_range: fallocate");
+   report_failure(161);
+   }
+
+   memmove(good_buf + end_offset, good_buf + offset,
+   file_size - offset);
+   memset(good_buf + offset, '\0', length);
+   file_size += length;
+}
+
+#else
+void
+do_insert_range(unsigned offset, unsigned length)
+{
+   return;
+}
+#endif
+
 #ifdef HAVE_LINUX_FALLOC_H
 /* fallocate is basically a no-op unless extending, then a lot like a truncate */
 void
@@ -1192,6 +1255,12 @@ test(void)
goto out;
}
break;
+   case OP_INSERT_RANGE:
+   if (!insert_range_calls) {
+   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, size);
+   goto out;
+   }
+   break;
}
 
switch (op) {
@@ -1244,6 +1313,21 @@ test(void)
}
do_collapse_range(offset, size);
break;
+   case OP_INSERT_RANGE:
+   TRIM_OFF_LEN(offset, size, (maxfilelen - 1) - file_size);
+   offset = offset & ~(block_size - 1);
+   size = size & ~(block_size - 1);
+   if (size == 0) {
+   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, size);
+   goto out;
+   }
+   if (file_size + size > maxfilelen) {
+   log4(OP_SKIPPED, OP_INSERT_RANGE, offset, size);
+   goto out;
+   }
+
+   do_insert_range(offset, size);
+   break;
default:
prterr("test: unknown 

[PATCH v5 8/10] xfstests: generic/032: Delayed allocation multi insert

2014-08-07 Thread Namjae Jeon
This test case (032) exercises various corner cases with delayed extents and
pre-existing holes for the finsert range functionality over different types of
extents.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 tests/generic/032 | 65 +
 tests/generic/032.out | 80 +++
 tests/generic/group   |  1 +
 3 files changed, 146 insertions(+)
 create mode 100644 tests/generic/032
 create mode 100644 tests/generic/032.out

diff --git a/tests/generic/032 b/tests/generic/032
new file mode 100644
index 000..32798f3
--- /dev/null
+++ b/tests/generic/032
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/032
+#
+# Delayed allocation multi insert range tests
+# This testcase is one of the 4 testcases which tries to
+# test various corner cases for finsert range functionality over different
+# type of extents. These tests are based on generic/255 test case.
+# For the type of tests, check the description of _test_generic_punch
+# in common/rc.
+#---
+# Copyright (c) 2013 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch -d -k falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/032.out b/tests/generic/032.out
new file mode 100644
index 000..6d7acfd
--- /dev/null
+++ b/tests/generic/032.out
@@ -0,0 +1,80 @@
+QA output created by 032
+   1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+   2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+   3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+22b7303d274481990b5401b6263effe0
+   4. hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+c4fef62ba1de9d91a977cfeec6632f19
+   5. hole -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..55]: extent
+1ca74f7572a0f4ab477fdbb5682e5f61
+   6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..47]: hole
+4: [48..55]: extent
+be0f35d4292a20040766d87883b0abd1
+   7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+bddb1f3895268acce30d516a99cb0f2f
+   8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..39]: hole
+4: [40..55]: extent
+f8fc47adc45b7cf72f988b3ddf5bff64
+   9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+c4fef62ba1de9d91a977cfeec6632f19
+   10. hole -> data -> hole
+0: [0..7]: extent
+1: [8..39]: hole
+2: [40..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+   11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+   12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+52af1bfcbf43f28af2328de32e0567e5
+   13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+   14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+   15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index 544d422..01d719b 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -34,6 +34,7 @@
 029 auto quick prealloc
 030 auto quick 

[PATCH v5 6/10] xfstests: generic/030: Delayed allocation insert range

2014-08-07 Thread Namjae Jeon
This test case (030) tests various corner cases of the finsert range
functionality with delayed extents over different types of extents.

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 tests/generic/030 | 65 ++
 tests/generic/030.out | 78 +++
 tests/generic/group   |  1 +
 3 files changed, 144 insertions(+)
 create mode 100644 tests/generic/030
 create mode 100644 tests/generic/030.out

diff --git a/tests/generic/030 b/tests/generic/030
new file mode 100644
index 000..7cbfa88
--- /dev/null
+++ b/tests/generic/030
@@ -0,0 +1,65 @@
+#! /bin/bash
+# FS QA Test No. generic/030
+#
+# Delayed allocation insert range tests
+# This test case is one of the 4 test cases which try to
+# test various corner cases for finsert range functionality over different
+# types of extents. These tests are based on the generic/255 test case.
+# For the type of tests, check the description of _test_generic_punch
+# in common/rc.
+#---
+# Copyright (c) 2013 Samsung Electronics.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+# we need to include common/punch to get the definition of filter functions
+. ./common/rc
+. ./common/filter
+. ./common/punch
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+
+_require_xfs_io_command "fpunch"
+_require_xfs_io_command "falloc"
+_require_xfs_io_command "fiemap"
+_require_xfs_io_command "finsert"
+
+testfile=$TEST_DIR/$seq.$$
+
+_test_generic_punch -d falloc fpunch finsert fiemap _filter_hole_fiemap $testfile
+_check_test_fs
+
+status=0
+exit
diff --git a/tests/generic/030.out b/tests/generic/030.out
new file mode 100644
index 000..5812622
--- /dev/null
+++ b/tests/generic/030.out
@@ -0,0 +1,78 @@
+QA output created by 030
+   1. into a hole
+cf845a781c107ec1346e849c9dd1b7e8
+   2. into allocated space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+64e72217eebcbdf31b1b058f9f5f476a
+   3. into unwritten space
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..55]: extent
+cf845a781c107ec1346e849c9dd1b7e8
+   4. hole -> data
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+   5. hole -> unwritten
+0: [0..31]: hole
+1: [32..47]: extent
+2: [48..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+   6. data -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+be0f35d4292a20040766d87883b0abd1
+   7. data -> unwritten
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+be0f35d4292a20040766d87883b0abd1
+   8. unwritten -> hole
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..31]: extent
+3: [32..55]: hole
+cf845a781c107ec1346e849c9dd1b7e8
+   9. unwritten -> data
+0: [0..7]: extent
+1: [8..23]: hole
+2: [24..47]: extent
+3: [48..55]: hole
+adb08a6d94a3b5eff90fdfebb2366d31
+   10. hole -> data -> hole
+0: [0..39]: hole
+1: [40..47]: extent
+2: [48..63]: hole
+0487b3c52810f994c541aa166215375f
+   11. data -> hole -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..39]: extent
+3: [40..47]: hole
+4: [48..63]: extent
+e3a8d52acc4d91a8ed19d7b6f4f26a71
+   12. unwritten -> data -> unwritten
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+0487b3c52810f994c541aa166215375f
+   13. data -> unwritten -> data
+0: [0..7]: extent
+1: [8..31]: hole
+2: [32..63]: extent
+2b22165f4a24a2c36fd05ef00b41df88
+   14. data -> hole @ EOF
+0: [0..23]: extent
+1: [24..39]: hole
+2: [40..55]: extent
+aa0f20d1edcdbce60d8ef82700ba30c3
+   15. data -> hole @ 0
+0: [0..15]: hole
+1: [16..55]: extent
+86c9d033be2761385c9cfa203c426bb2
diff --git a/tests/generic/group b/tests/generic/group
index b1dc921..b847c47 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -32,6 +32,7 @@
 027 auto enospc
 028 auto quick
 029 auto quick prealloc
+030 auto quick prealloc
 053 acl repair auto quick
 062 

[PATCH v5 2/10] xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate

2014-08-07 Thread Namjae Jeon
This patch implements fallocate's FALLOC_FL_INSERT_RANGE for XFS.

1) Make sure that both offset and len are block size aligned.
2) Increase the i_size of the inode by len bytes.
3) Compute the file's logical block number for offset. If the computed
   block number is not the starting block of an extent, split the extent
   so that the block number becomes the starting block of an extent.
4) Shift all the extents lying between [offset, last allocated extent]
   to the right by len bytes. This step creates a hole of len bytes
   at offset.
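
Steps 3) and 4) above can be sketched in userspace as follows. This is only
a model under stated assumptions: struct extent, insert_hole() and the flat
sorted array are hypothetical illustrations, not the XFS in-core extent
format or the code in this patch, and units are blocks rather than bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one extent; not the XFS in-core representation. */
struct extent {
	uint64_t start;	/* first logical block covered by the extent */
	uint64_t count;	/* number of blocks in the extent */
};

/*
 * Insert a hole of len blocks at block off.
 * ext[] holds *nr extents sorted by start; cap is the array capacity.
 * Returns 0 on success, -1 if a split is needed but the array is full.
 */
static int insert_hole(struct extent *ext, size_t *nr, size_t cap,
		       uint64_t off, uint64_t len)
{
	size_t i, j;

	assert(*nr <= cap);

	/* Step 3: if off falls strictly inside an extent, split it there. */
	for (i = 0; i < *nr; i++) {
		uint64_t end = ext[i].start + ext[i].count;

		if (off > ext[i].start && off < end) {
			if (*nr == cap)
				return -1;
			/* Open a slot for the right-hand half. */
			for (j = *nr; j > i + 1; j--)
				ext[j] = ext[j - 1];
			ext[i + 1].start = off;
			ext[i + 1].count = end - off;
			ext[i].count = off - ext[i].start;
			(*nr)++;
			break;
		}
	}

	/* Step 4: shift extents at or beyond off, starting from the last. */
	for (i = *nr; i-- > 0; ) {
		if (ext[i].start < off)
			break;
		ext[i].start += len;
	}
	return 0;
}
```

Walking from the last extent towards offset means an extent never moves
onto a range that a later extent still occupies.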

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
Reviewed-by: Brian Foster 
---
Changelog

v5:
 - remove allocation part.

v4:
 - set cur->bc_private.b.allocated to zero before calling xfs_btree_del_cursor.

v3:
 - remove XFS_TRANS_RESERVE and assert.
 - update the comment of blockcount calculation.
 - use 'if(blockcount)' instead of 'if (got.br_blockcount < blockcount)'.
 - move the insert_file_space() call under xfs_setattr_size to avoid code
   duplication.

v2:
 - remove reserved enable.
 - add xfs_qm_dqattach.
 - reset blockcount in xfs_bmap_shift_extents_right.
 - update i_size to avoid data loss before insert_file_space() is called.
 - use in-memory extent array size that delayed allocation extents

 fs/xfs/libxfs/xfs_bmap.c | 379 ++-
 fs/xfs/libxfs/xfs_bmap.h |   9 +-
 fs/xfs/xfs_bmap_util.c   | 123 ++-
 fs/xfs/xfs_bmap_util.h   |   2 +
 fs/xfs/xfs_file.c|  38 -
 fs/xfs/xfs_trace.h   |   1 +
 6 files changed, 547 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index de2d26d..62f5aa7 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5413,7 +5413,7 @@ error0:
  * into, this will be considered invalid operation and we abort immediately.
  */
 int
-xfs_bmap_shift_extents(
+xfs_bmap_shift_extents_left(
        struct xfs_trans        *tp,
        struct xfs_inode        *ip,
        int                     *done,
@@ -5443,7 +5443,7 @@ xfs_bmap_shift_extents(
(XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
 XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_BTREE),
 mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
-   XFS_ERROR_REPORT("xfs_bmap_shift_extents",
+   XFS_ERROR_REPORT("xfs_bmap_shift_extents_left",
 XFS_ERRLEVEL_LOW, mp);
return -EFSCORRUPTED;
}
@@ -5600,3 +5600,378 @@ del_cursor:
xfs_trans_log_inode(tp, ip, logflags);
return error;
 }
+
+/*
+ * Splits an extent into two extents at the split_fsb block so that it becomes
+ * the first block of current_ext. @current_ext is the target extent
+ * to be split. @split_fsb is the block where the extent is split.
+ * If split_fsb lies in a hole or the first block of extents, just return 0.
+ */
+STATIC int
+xfs_bmap_split_extent_at(
+   struct xfs_trans        *tp,
+   struct xfs_inode        *ip,
+   xfs_fileoff_t           split_fsb,
+   xfs_extnum_t            *current_ext,
+   xfs_fsblock_t           *firstfsb,
+   struct xfs_bmap_free    *free_list)
+{
+   int                     whichfork = XFS_DATA_FORK;
+   struct xfs_btree_cur    *cur;
+   struct xfs_bmbt_rec_host *gotp;
+   struct xfs_bmbt_irec    got;
+   struct xfs_bmbt_irec    new; /* split extent */
+   struct xfs_mount        *mp = ip->i_mount;
+   struct xfs_ifork        *ifp;
+   xfs_fsblock_t           gotblkcnt; /* new block count for got */
+   int                     error = 0;
+   int                     logflags;
+   int                     i = 0;
+
+   if (unlikely(XFS_TEST_ERROR(
+   (XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_EXTENTS &&
+XFS_IFORK_FORMAT(ip, whichfork) != XFS_DINODE_FMT_BTREE),
+mp, XFS_ERRTAG_BMAPIFORMAT, XFS_RANDOM_BMAPIFORMAT))) {
+   XFS_ERROR_REPORT("xfs_bmap_split_extent_at",
+XFS_ERRLEVEL_LOW, mp);
+   return -EFSCORRUPTED;
+   }
+
+   if (XFS_FORCED_SHUTDOWN(mp))
+   return -EIO;
+
+   ASSERT(current_ext != NULL);
+
+   ifp = XFS_IFORK_PTR(ip, whichfork);
+   if (!(ifp->if_flags & XFS_IFEXTENTS)) {
+   /* Read in all the extents */
+   error = xfs_iread_extents(tp, ip, whichfork);
+   if (error)
+   return error;
+   }
+
+   gotp = xfs_iext_bno_to_ext(ifp, split_fsb, current_ext);
+   /*
+* gotp can be null in 2 cases: 1) if there are no extents
+* or 2) split_fsb lies in a hole beyond which there are
+* no extents. Either way, we are done.
+*/
+   if (!gotp)
+   return 0;
+
+   xfs_bmbt_get_all(gotp, &got);
+
+  

[PATCH v5 0/10] fs: Introduce FALLOC_FL_INSERT_RANGE for fallocate

2014-08-07 Thread Namjae Jeon
Continuing the work to make non-linear editing of media files faster,
this series introduces a new fallocate flag, FALLOC_FL_INSERT_RANGE.

This flag is the opposite of the FALLOC_FL_COLLAPSE_RANGE flag.
Specifying FALLOC_FL_INSERT_RANGE creates new space inside a file by
inserting a hole within the range specified by offset and len.
The user can then write new data into this space, e.g. ads.
As with collapse range, there is currently the limitation that offset and
len must be block size aligned for both XFS and Ext4.

The semantics of the flag are:
1) It creates space within a file by inserting a hole of len bytes starting
   at byte offset, without overwriting any existing data. All the data blocks
   from offset to EOF are shifted to the right to make room for the hole.
2) It must be used exclusively; no other fallocate flag may be combined
   with it.
3) The offset and length supplied to fallocate must be fs block size aligned
   for XFS and ext4.
4) Insert range does not work when offset is at or beyond i_size. Users who
   want to insert space at the end of a file are advised to use either
   ftruncate(2) or fallocate(2) with mode 0.
5) It increases the size of the file by len bytes.
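
Rules 3) to 5) above can be modelled with a small userspace check. This is a
sketch only: insert_range_new_size() and its signature are hypothetical and
do not appear in the series, the block size is passed in rather than queried
from a filesystem, and overflow of the new size is not handled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper modelling the argument rules for insert range.
 * Stores the resulting file size in *new_size and returns 0,
 * or returns -EINVAL when the request breaks one of the rules.
 */
static int insert_range_new_size(uint64_t size, uint64_t blksize,
				 uint64_t offset, uint64_t len,
				 uint64_t *new_size)
{
	assert(blksize != 0);

	/* fallocate(2) rejects a zero length for every mode. */
	if (len == 0)
		return -EINVAL;
	/* Rule 3: offset and len must be block size aligned. */
	if (offset % blksize || len % blksize)
		return -EINVAL;
	/* Rule 4: offset must lie inside the file, not at or past EOF. */
	if (offset >= size)
		return -EINVAL;
	/* Rule 5: the file grows by len bytes. */
	*new_size = size + len;
	return 0;
}
```

For a 16384-byte file with 4096-byte blocks, inserting 8192 bytes at offset
4096 is accepted and yields a 24576-byte file, while offset 16384 (EOF) is
rejected.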


Namjae Jeon (10):
 fs: Add support FALLOC_FL_INSERT_RANGE for fallocate
 xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate
 ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate
 xfsprogs: xfs_io: add finsert command for insert range via fallocate
 xfstests: generic/029: Standard insert range tests
 xfstests: generic/030: Delayed allocation insert range
 xfstests: generic/031: Multi insert range tests
 xfstests: generic/032: Delayed allocation multi insert
 xfstests: fsstress: Add fallocate insert range operation
 xfstests: fsx: Add fallocate insert range operation

-- 
1.7.11-rc0



[PATCH v5 1/10] fs: Add support FALLOC_FL_INSERT_RANGE for fallocate

2014-08-07 Thread Namjae Jeon
FALLOC_FL_INSERT_RANGE is the opposite of FALLOC_FL_COLLAPSE_RANGE. It is
needed by advertisers or anyone else who wants to add data in the middle of
a file. FALLOC_FL_INSERT_RANGE creates space for writing new data within a
file by shifting extents to the right by the given length. It has the same
limitations as FALLOC_FL_COLLAPSE_RANGE: offset and length must be on block
boundaries, and ftruncate(2) should be used for ranges that cross EOF.
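
The two mode checks this patch touches in do_fallocate() can be modelled in
userspace as below. This is a sketch, not the kernel function: check_mode()
and the FL_* names are hypothetical, the other checks in do_fallocate() are
left out, and the flag values are copied from include/uapi/linux/falloc.h
(with 0x20 being the value this patch proposes).

```c
#include <errno.h>

/* Local copies of the fallocate mode bits, renamed to avoid header clashes. */
#define FL_KEEP_SIZE		0x01
#define FL_PUNCH_HOLE		0x02
#define FL_COLLAPSE_RANGE	0x08
#define FL_ZERO_RANGE		0x10
#define FL_INSERT_RANGE		0x20	/* added by this patch */

#define FL_SUPPORTED (FL_KEEP_SIZE | FL_PUNCH_HOLE | FL_COLLAPSE_RANGE | \
		      FL_ZERO_RANGE | FL_INSERT_RANGE)

/* Returns 0 if the mode passes the two checks modelled here. */
static int check_mode(int mode)
{
	/* Unknown bits are not supported at all. */
	if (mode & ~FL_SUPPORTED)
		return -EOPNOTSUPP;

	/* Insert range must be the only flag set. */
	if ((mode & FL_INSERT_RANGE) && (mode & ~FL_INSERT_RANGE))
		return -EINVAL;

	return 0;
}
```

With this model, FL_INSERT_RANGE alone passes, FL_INSERT_RANGE combined with
FL_KEEP_SIZE gives -EINVAL, and an unknown bit such as 0x40 gives
-EOPNOTSUPP.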

Signed-off-by: Namjae Jeon 
Signed-off-by: Ashish Sangwan 
---
 fs/open.c   |  8 +++-
 include/uapi/linux/falloc.h | 15 +++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/open.c b/fs/open.c
index 36662d0..74ed498 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -232,7 +232,8 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 
/* Return error if mode is not supported */
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE))
+FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
+FALLOC_FL_INSERT_RANGE))
return -EOPNOTSUPP;
 
/* Punch hole and zero range are mutually exclusive */
@@ -250,6 +251,11 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
(mode & ~FALLOC_FL_COLLAPSE_RANGE))
return -EINVAL;
 
+   /* Insert range should only be used exclusively. */
+   if ((mode & FALLOC_FL_INSERT_RANGE) &&
+   (mode & ~FALLOC_FL_INSERT_RANGE))
+   return -EINVAL;
+
if (!(file->f_mode & FMODE_WRITE))
return -EBADF;
 
diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h
index d1197ae..1f20723 100644
--- a/include/uapi/linux/falloc.h
+++ b/include/uapi/linux/falloc.h
@@ -41,4 +41,19 @@
  */
 #define FALLOC_FL_ZERO_RANGE   0x10
 
+/*
+ * FALLOC_FL_INSERT_RANGE is used to insert space within the file size without
+ * overwriting any existing data. The contents of the file beyond offset are
+ * shifted towards right by len bytes to create a hole.  As such, this
+ * operation will increase the size of the file by len bytes.
+ * Different filesystems may implement different limitations on the granularity
+ * of the operation. Most will limit operations to filesystem block size
+ * boundaries, but this boundary may be larger or smaller depending on
+ * the filesystem and/or the configuration of the filesystem or file.
+ * Attempting to insert space using this flag at OR beyond the end of
+ * the file is considered an illegal operation - just use ftruncate(2) or
+ * fallocate(2) with mode 0 for such type of operations.
+ */
+#define FALLOC_FL_INSERT_RANGE 0x20
+
 #endif /* _UAPI_FALLOC_H_ */
-- 
1.7.11-rc0



Re: [PATCH 3/4] block: fix error return code

2014-08-07 Thread Julia Lawall


On Thu, 7 Aug 2014, Jeff Moyer wrote:

> Julia Lawall  writes:
> 
> > diff --git a/drivers/block/rsxx/core.c b/drivers/block/rsxx/core.c
> > index a8de2ee..fa8077a 100644
> > --- a/drivers/block/rsxx/core.c
> > +++ b/drivers/block/rsxx/core.c
> > @@ -942,6 +942,7 @@ static int rsxx_pci_probe(struct pci_dev *dev,
> > card->event_wq = create_singlethread_workqueue(DRIVER_NAME"_event");
> > if (!card->event_wq) {
> > dev_err(CARD_TO_DEV(card), "Failed card event setup.\n");
> > +   st = -ENOMEM;
> > goto failed_event_handler;
> > }
> 
> Reviewed-by: Jeff Moyer 
> 
> BTW, just above this there is questionable code:
> 
> st = rsxx_get_num_targets(card, &card->n_targets);
> if (st)
> dev_info(CARD_TO_DEV(card),
> "Failed reading the number of DMA targets\n");
> 
> card->ctrl = kzalloc(card->n_targets * sizeof(*card->ctrl), GFP_KERNEL);
> if (!card->ctrl) {
> st = -ENOMEM;
> goto failed_dma_setup;
> }
> 
> From my reading of the kzalloc code, ZERO_SIZE_PTR (which is 16 cast to
> a void *) would be returned from that kzalloc call if the
> rsxx_get_num_targets call failed (since you'd be kzalloc-ing 0 bytes).
> That would lead to the !card->ctrl check not working, right?
> 
> I'd suggest not continuing after rsxx_get_num_targets fails.

Good point.  I'll fix it up.
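
To illustrate the point about zero-size allocations, here is a userspace
model. It is only a sketch: the model_* names are hypothetical stand-ins,
calloc() replaces the slab allocator, and the sentinel value 16 is taken from
the description quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the kernel's ZERO_SIZE_PTR and ZERO_OR_NULL_PTR(). */
#define MODEL_ZERO_SIZE_PTR	((void *)16)
#define MODEL_ZERO_OR_NULL_PTR(p) \
	((uintptr_t)(p) <= (uintptr_t)MODEL_ZERO_SIZE_PTR)

/* Userspace stand-in for kzalloc(): zero-size requests get the sentinel. */
static void *model_kzalloc(size_t size)
{
	if (size == 0)
		return MODEL_ZERO_SIZE_PTR;
	return calloc(1, size);
}

/* Userspace stand-in for kfree(): the sentinel and NULL are not freed. */
static void model_kfree(void *p)
{
	if (!MODEL_ZERO_OR_NULL_PTR(p))
		free(p);
}
```

A plain `if (!ptr)` test passes for the sentinel, so a caller that wants to
reject an empty allocation has to check the element count (or use a
ZERO_OR_NULL_PTR-style test) before relying on the buffer.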

julia


kernel in spinlock or descheduled: packet loss during 1 Mpps transfer

2014-08-07 Thread anshul makkar
Hi,

I am transferring 1 million packets from a send process to a receive
process. Both are running on separate cores. I am using DMA transfer
from user mode to the Ethernet card.

After the transfer I can see 700 - 800 packets lost per transaction.

I suspect that during the transfer the kernel may be entering some
spinlocks or descheduling the receiver process for long enough to cause
the loss.

Could you please share how I can debug this case, and what approach I
can take to detect that? The time involved in the transfer is very
short: the rate is around 1 million packets per second, and the packet
loss happens within that second.

Thanks
Anshul Makkar
www.justkernel.com
http://www.linkedin.com/groups/Just-Kernel-3033180


Re: fault injection caused oops in proc_flush_task

2014-08-07 Thread Eric W. Biederman
Dave Jones  writes:

> Because I don't have enough oopses in my life, I decided to play
> with the fault injection code today. It's not something I hear about
> people trying too often, so I wondered what horrors lurk..
>
> So I ran this..
>
> #!/bin/bash
>
> for FAILTYPE in failslab fail_page_alloc
> do
>  echo N > /sys/kernel/debug/$FAILTYPE/task-filter
>  echo 50 > /sys/kernel/debug/$FAILTYPE/probability
>  echo 500 > /sys/kernel/debug/$FAILTYPE/interval
>  echo -1 > /sys/kernel/debug/$FAILTYPE/times
>  echo 0 > /sys/kernel/debug/$FAILTYPE/space
>  echo 0 > /sys/kernel/debug/$FAILTYPE/verbose
>  echo 2 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
> done
>
> And then ran my usual fuzzing session, and saw this ..
>
> Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 2 PID: 8506 Comm: trinity-c124 Not tainted 3.16.0+ #41
> task: 880227fc95e0 ti: 8800929a task.ti: 8800929a
> RIP: 0010:[]  [] proc_flush_task+0x99/0x1b0
> RSP: 0018:8800929a3d40  EFLAGS: 00010246
> RAX: 0001 RBX: 8800929a3d6b RCX: 
> RDX: 8800929a3d6c RSI: 8800929a3d58 RDI: 
> RBP: 8800929a3da8 R08: 000a R09: fffb
> R10:  R11:  R12: 0001
> R13:  R14:  R15: 0002
> FS:  7f019c95d700() GS:88024d10() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 959e7000 CR4: 001407e0
> DR0: 0249e000 DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0600
> Stack:
>  88022ad836c0 0002929a3d58 88022ad836c0 000132313068
>  8800929a3d6b 373231003277b6b0 88022377b600 898577d0
>  88022377b6b0 278f 0010 
> Call Trace:
>  [] release_task+0x4c/0x4a0
>  [] wait_consider_task+0x70a/0xbe0
>  [] do_wait+0x144/0x2d0
>  [] SyS_wait4+0x7b/0x100
>  [] ? task_stopped_code+0x60/0x60
>  [] tracesys+0xdd/0xe2
> Code: d4 e0 a9 86 48 03 45 a8 89 4d a4 44 8b 78 30 48 8b 40 38 44 89 f9 4c 8b a8 40 08 00 00 31 c0 e8 6e 21 0f 00 48 8d 75 b0 89 45 b4 <49> 8b 7d 00 e8 4e 8b fa ff 48 85 c0 49 89 c6 74 18 48 89 c7 e8
> RIP  [] proc_flush_task+0x99/0x1b0
>
>
> Right before the oops, the last thing fault injection logged wrt that pid was..
>
> FAULT_INJECTION: forcing a failure
> CPU: 0 PID: 8506 Comm: trinity-c124 Not tainted 3.16.0+ #41
>  0032 898577d0 8800929a3b08 86759797
>  86c6a300 8800929a3b28 86358c30 8020
>  8020 8800929a3b38 861c6850 8800929a3b88
> Call Trace:
>  [] dump_stack+0x4e/0x7a
>  [] should_fail+0x100/0x110
>  [] should_failslab+0x40/0x50
>  [] kmem_cache_alloc+0x5e/0x270
>  [] ida_pre_get+0x69/0xf0
>  [] ? proc_fill_super+0xa0/0xa0
>  [] get_anon_bdev+0x39/0x120
>  [] ? proc_fill_super+0xa0/0xa0
>  [] set_anon_super+0x16/0x30
>  [] proc_set_super+0x1d/0x80
>  [] sget+0x33a/0x400
>  [] ? proc_root_lookup+0x40/0x40
>  [] proc_mount+0xa7/0x150
>  [] mount_fs+0x38/0x1c0
>  [] vfs_kern_mount+0x64/0x120
>  [] kern_mount_data+0x19/0x30
>  [] pid_ns_prepare_proc+0x1c/0x30
>  [] alloc_pid+0x474/0x4c0
>  [] ? flush_tlb_mm_range+0x80/0x200
>  [] ? copy_thread+0x11d/0x2c0
>  [] copy_process.part.29+0xab0/0x1be0
>  [] do_fork+0xdd/0x400
>  [] ? preempt_count_sub+0xab/0x100
>  [] ? __this_cpu_preempt_check+0x13/0x20
>  [] SyS_clone+0x16/0x20
>  [] stub_clone+0x69/0x90
>  [] ? tracesys+0xdd/0xe2
>
> Should proc_flush_task just be checking for a NULL upid->ns ?
> Or is there something in the pid_ns_prepare_proc failure path
> that we're failing to undo ?
>
> thoughts?
>
> I don't know how feasible it would be to hit that in real life
> without the fault injection stuff, but an oops can't be the right
> thing to do in any case.

Hmm.

So what we can reconstruct from your data is:

8506 probably unshared a pid namespace.
8506 called fork/clone creating the first process in a pid namespace and
 alloc_pid fails.
8506 calls wait and the task it finds to be waited for is defective
 and proc_flush_task oopses.

*scratches head*

So I don't think the last fault had anything to do with this failure.

My disassembly puts the code at the call of shrink_dcache_parent, but I
don't think my disassembly matches your kernel.  Can you disassemble
proc_flush_task so we can at least see what is failing?


Eric


Re: [PATCH] staging/lustre: use rcu_dereference to access rcu protected current->real_parent field

2014-08-07 Thread Greg Kroah-Hartman
On Fri, Aug 08, 2014 at 12:03:20AM -0400, Oleg Drokin wrote:
> Hello!
> 
> On Aug 7, 2014, at 11:49 PM, Greg Kroah-Hartman wrote:
> >> 
> >> This is not a critical bug and in the worst case the code here may
> >> cause miss of statistics counter increase.
> >> This is why I think it is not worth to backport the patch at all.
> > You are right, and if this is just for some random "statistics" file,
> > can we just delete the whole function?
> 
> I hope not!
> This is used all around the client to tally up various operations executed counts.

Why would you do that?  Why would they care?

> The statistic is then used by various userspace monitoring tools.

Why not use the in-kernel monitoring tools instead of creating your own?
What does userspace do with that information?

thanks,

greg k-h


Re: Request to include Mailbox tree in linux-next

2014-08-07 Thread Stephen Rothwell
Hi Jassi,

On Wed, 6 Aug 2014 12:25:49 +0530 Jassi Brar  wrote:
>
>  The framework for Mailbox has undergone 10 revisions over the last
> one year, which has garnered support in the form of 'Reviewed-by' and
> 'looks good enough to be merged in this window' from people in the CC
> list.
> 
>   Could you please add it to linux-next?
> Tree:   git://git.linaro.org/landing-teams/working/fujitsu/integration.git
> Branch:   mailbox-for-3.17
> Contact:  Jassi Brar 

This is really late for v3.17.  The purpose of linux-next is to
discover interactions between trees before they are pulled into Linus'
tree and to do some cross architecture build checking.  Adding a tree
during the merge window is not very helpful to that.

However if this is really going to be merged by Linus before v3.17-rc1,
I will add it on Monday, OK?

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




btrfs BUG() in __set_extent_bit on GFP_ATOMIC failure.

2014-08-07 Thread Dave Jones
While playing with fault injection, I hit this quite easily.

kernel BUG at fs/btrfs/extent_io.c:990!
invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 1 PID: 1270 Comm: fsx Not tainted 3.16.0+ #41
task: 88023fe46d60 ti: 8802405a8000 task.ti: 8802405a8000
RIP: 0010:[]  [] __set_extent_bit+0x574/0x660 [btrfs]
...
 [] ? set_track+0x9c/0x140
 [] lock_extent_bits+0x94/0x310 [btrfs]
 [] ? pagecache_get_page+0xb4/0x210
 [] lock_and_cleanup_extent_if_need+0xee/0x1f0 [btrfs]
 [] __btrfs_buffered_write+0x1b1/0x680 [btrfs]
 [] ? preempt_count_sub+0xab/0x100
 [] btrfs_file_write_iter+0x17e/0x570 [btrfs]
 [] new_sync_write+0x8e/0xd0
 [] vfs_write+0xb7/0x1f0
 [] SyS_write+0x58/0xd0
 [] tracesys+0xdd/0xe2


 989 prealloc = alloc_extent_state_atomic(prealloc);
 990 BUG_ON(!prealloc);

 541 static struct extent_state *
 542 alloc_extent_state_atomic(struct extent_state *prealloc)
 543 {
 544 if (!prealloc)
 545 prealloc = alloc_extent_state(GFP_ATOMIC);
 546 
 547 return prealloc;
 548 }


Going BUG() on a GFP_ATOMIC allocation failure seems a bit excessive.
Surely there's something better we can do here ?

Dave



Re: [PATCH v2] Hibernate: check unsafe page should not in e820 reserved region

2014-08-07 Thread joeyli
On Thu, Aug 07, 2014 at 11:05:33PM +0200, Pavel Machek wrote:
> On Thu 2014-08-07 18:17:34, joeyli wrote:
> > On Thu, Aug 07, 2014 at 11:39:57AM +0200, Pavel Machek wrote:
> > > > > Actually, if you are doing such a check... it makes sense to check for
> > > > > _all_ the regions, nosave or not. If e820 map changed at all, it is
> > > > > not safe to resume.
> > > > >   
> > > > > Pavel
> > > > 
> > > > Currently nosave region only called register by e820 code, so
> > > > hibernate's nosave region included e820 reserved, ACPI data and
> > > > ACPI NVS region.
> > > > 
> > > > I thought hashing the start/end pfn of above regions is enough.
> > > 
> > > If ammount of memory changed, for example, it is unsafe to
> > > resume. So if you are doing the check, anyway, please hash
> > > whole e820 table.
> > 
> > There already have num_physpages in header for check the total
> > physical page number.
> 
> Good, but if ammount of memory stayed the same, but offsets
> changed (for example), resume is unsafe, too.

I agree.

> 
> When I wrote that num_physpages check, I should have checked
> whole e820  table, instead. (If anything at all changed there,
> "new" kernel is running with wrong e820 info).
> 
> You seem to be in great position to fix that mistake now...
> 
>   Pavel

Hashing e820 is fine, but it cannot provide detailed information to the
user/developer when an issue happens. We would only know that the e820
table changed, with no further information for bug tracking, and not
many shipping machines have a serial console.

After checking the space in swsusp_info, I hope to store the whole e820
table in the snapshot header so the ranges can be compared. The maximum
space consumed is 20 * E820MAX = 20 * 128 = 2560 bytes. Currently
swsusp_info uses 430 bytes of one page, so there is enough room to keep
the whole e820 table.
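
The arithmetic above can be checked with a short sketch. The assumptions are
labeled in the code: the model_* names are hypothetical, the entry layout
(64-bit address, 64-bit size, 32-bit type, packed) is assumed in order to
arrive at the 20 bytes quoted above, and a 4096-byte page is assumed.

```c
#include <stdint.h>

#define MODEL_E820MAX		128	/* value quoted above */
#define MODEL_SWSUSP_INFO_USED	430	/* bytes quoted above */
#define MODEL_PAGE_SIZE		4096	/* assumption: 4 KiB pages */

/* Assumed entry layout; packed so that no padding is added. */
struct model_e820entry {
	uint64_t addr;	/* start of the memory range */
	uint64_t size;	/* length of the memory range */
	uint32_t type;	/* range type, e.g. RAM or reserved */
} __attribute__((packed));

_Static_assert(sizeof(struct model_e820entry) == 20,
	       "packed entry is expected to be 20 bytes");

/* Space needed to store a full table in the snapshot header. */
static unsigned long e820_table_bytes(void)
{
	return sizeof(struct model_e820entry) * MODEL_E820MAX;
}

/* Non-zero if the table fits next to the current header in one page. */
static int e820_table_fits(void)
{
	return MODEL_SWSUSP_INFO_USED + e820_table_bytes() <= MODEL_PAGE_SIZE;
}
```

Under these assumptions the header would grow from 430 to 2990 bytes, which
still fits in a single 4096-byte page.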

Also, I think we don't need to compare the ranges of the E820_RAM and
E820_RESERVED_KERN types, because they are used by the OS and stored in
the snapshot image.


Thanks a lot!
Joey Lee


[RFC PATCH v2] kprobes: arm: enable OPTPROBES for ARM 32

2014-08-07 Thread Wang Nan
This patch introduces kprobeopt for ARM 32.

Limitations:
 - Currently only kernels compiled with the ARM ISA are supported.

 - The offset between the probe point and the optinsn slot must not be
   larger than 32MiB. Masami Hiramatsu suggests replacing 2 words, but
   that would make things complex. A further patch can make such an
   optimization.

Kprobe opt on ARM is relatively simpler than kprobe opt on x86 because
an ARM instruction is always 4 bytes aligned and 4 bytes long. This patch
replaces the probed instruction with a 'b', which branches to trampoline
code that then calls optimized_callback(). optimized_callback() calls
opt_pre_handler() to execute the kprobe handler. It also emulates/simulates
the replaced instruction.
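
The 32MiB limit comes from the encoding of the ARM 'b' instruction, which
holds a signed 24-bit word offset. A rough sketch of the range check and
encoding follows. model_arm_branch() is a hypothetical helper written for
illustration, not the kernel's arm_gen_branch(); it only produces the
unconditional form, and returning 0 for an unencodable branch is a
convention chosen for this sketch.

```c
#include <stdint.h>

/*
 * Encode an unconditional ARM 'b' from address pc to address target.
 * Returns the 32-bit instruction word, or 0 if the branch cannot be encoded.
 */
static uint32_t model_arm_branch(uint32_t pc, uint32_t target)
{
	/* In ARM state the PC reads as the instruction address plus 8. */
	int64_t offset = (int64_t)target - ((int64_t)pc + 8);

	/* ARM instructions are word aligned, so the offset must be too. */
	if (offset & 3)
		return 0;

	/* imm24 is a signed word count: -32MiB .. +32MiB - 4 in bytes. */
	if (offset < -33554432 || offset > 33554428)
		return 0;

	/* cond = AL (0xE), bits 27..25 = 101, L = 0, imm24 = offset / 4 */
	return 0xEA000000u | (((uint32_t)offset >> 2) & 0x00FFFFFFu);
}
```

A branch to itself encodes as 0xEAFFFFFE, the familiar "b ." word, and a
target 32MiB beyond pc + 8 is rejected.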

When unregistering a kprobe, the deferred manner of the unoptimizer may
leave the branch instruction in place before the optimizer is called.
Unlike x86_64, which only copies the probed insn after
optprobe_template_end and re-executes it, this patch calls singlestep to
emulate/simulate the insn directly. A further patch can optimize this
behavior.

v1 -> v2:

 - Improvement: if replaced instruction is conditional, generate a
   conditional branch instruction for it;

 - Introduces RELATIVEJUMP_OPCODES because ARM kprobe_opcode_t is 4
   bytes;

 - Removes size field in struct arch_optimized_insn;

 - Use arm_gen_branch() to generate branch instruction;

 - Remove all recovery logic: ARM doesn't use a tail buffer, so there is
   no need to recover replaced instructions as on x86;

 - Remove incorrect CONFIG_THUMB checking;

 - can_optimize() always returns true if address is well aligned;

 - Improve optimized_callback: using opt_pre_handler();

 - Bugfix: correct range checking code and improve comments;

 - Fix commit message.

Signed-off-by: Wang Nan 
Cc: Masami Hiramatsu 
Cc: Jon Medhurst (Tixy) 
Cc: Russell King - ARM Linux 
---
 arch/arm/Kconfig   |   1 +
 arch/arm/include/asm/kprobes.h |  26 +
 arch/arm/kernel/Makefile   |   3 +-
 arch/arm/kernel/kprobes-opt.c  | 257 +
 4 files changed, 286 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/kernel/kprobes-opt.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 290f02ee..2106918 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -57,6 +57,7 @@ config ARM
select HAVE_MEMBLOCK
select HAVE_MOD_ARCH_SPECIFIC if ARM_UNWIND
select HAVE_OPROFILE if (HAVE_PERF_EVENTS)
+   select HAVE_OPTPROBES if (!THUMB2_KERNEL)
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
diff --git a/arch/arm/include/asm/kprobes.h b/arch/arm/include/asm/kprobes.h
index 49fa0df..caacc7c 100644
--- a/arch/arm/include/asm/kprobes.h
+++ b/arch/arm/include/asm/kprobes.h
@@ -51,5 +51,31 @@ int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
 int kprobe_exceptions_notify(struct notifier_block *self,
 unsigned long val, void *data);
 
+/* optinsn template addresses */
+extern __visible kprobe_opcode_t optprobe_template_entry;
+extern __visible kprobe_opcode_t optprobe_template_val;
+extern __visible kprobe_opcode_t optprobe_template_call;
+extern __visible kprobe_opcode_t optprobe_template_end;
+
+#define MAX_OPTIMIZED_LENGTH   (4)
+#define MAX_OPTINSN_SIZE   \
+   (((unsigned long)&optprobe_template_end -   \
+ (unsigned long)&optprobe_template_entry))
+#define RELATIVEJUMP_SIZE  (4)
+#define RELATIVEJUMP_OPCODES   ((RELATIVEJUMP_SIZE) / sizeof(kprobe_opcode_t))
+
+struct arch_optimized_insn {
+   /*
+* copy of the original instructions.
+* Different from x86, ARM kprobe_opcode_t is u32.
+*/
+   kprobe_opcode_t copied_insn[RELATIVEJUMP_OPCODES];
+   /* detour code buffer */
+   kprobe_opcode_t *insn;
+   /*
+*  we always copy one instruction on arm32,
+*  the size is always 4, so no size field is needed.
+*/
+};
 
 #endif /* _ARM_KPROBES_H */
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f..6a38ec1 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -52,11 +52,12 @@ obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o insn.o
 obj-$(CONFIG_JUMP_LABEL)   += jump_label.o insn.o patch.o
 obj-$(CONFIG_KEXEC)+= machine_kexec.o relocate_kernel.o
 obj-$(CONFIG_UPROBES)  += probes.o probes-arm.o uprobes.o uprobes-arm.o
-obj-$(CONFIG_KPROBES)  += probes.o kprobes.o kprobes-common.o patch.o
+obj-$(CONFIG_KPROBES)  += probes.o kprobes.o kprobes-common.o patch.o insn.o
 ifdef CONFIG_THUMB2_KERNEL
 obj-$(CONFIG_KPROBES)  += kprobes-thumb.o probes-thumb.o
 else
 obj-$(CONFIG_KPROBES)  += kprobes-arm.o probes-arm.o
+obj-$(CONFIG_OPTPROBES)+= kprobes-opt.o
 endif
 obj-$(CONFIG_ARM_KPROBES_TEST) += test-kprobes.o
 test-kprobes-objs  := kprobes-test.o
diff --git a/arch/arm/kernel/kprobes-opt.c b/arch/arm/kernel/kprobes-opt.c
new file mode 100644
index 000..3b721e7
--- 

Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-07 Thread Jason Low
On Fri, 2014-08-08 at 02:02 +0800, Yuyang Du wrote:
> On Wed, Aug 06, 2014 at 11:21:35AM -0700, Jason Low wrote:
> > I ran these tests with most of the AIM7 workloads to compare its
> > performance between a 3.16 kernel and the kernel with these patches
> > applied.
> > 
> > The table below contains the percent difference between the baseline
> > kernel and the kernel with the patches at various user counts. A
> > positive percent means the kernel with the patches performed better,
> > while a negative percent means the baseline performed better.
> > 
> > Based on these numbers, for many of the workloads, the change was
> > beneficial in the highly contended cases, while it had a negative impact
> > in many of the lightly/moderately contended cases (10 to 90 users).
> > 
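> > --------------------------------------------------------
> >               |   10-90   |  100-1000   |  1100-2000
> >               |   users   |    users    |    users
> > --------------------------------------------------------
> > alltests      |   -3.37%  |   -10.64%   |   -2.25%
> > --------------------------------------------------------
> > all_utime     |   +0.33%  |    +3.73%   |   +3.33%
> > --------------------------------------------------------
> > compute       |   -5.97%  |    +2.34%   |   +3.22%
> > --------------------------------------------------------
> > custom        |  -31.61%  |   -10.29%   |  +15.23%
> > --------------------------------------------------------
> > disk          |  +24.64%  |   +28.96%   |  +21.28%
> > --------------------------------------------------------
> > fserver       |   -1.35%  |    +4.82%   |   +9.35%
> > --------------------------------------------------------
> > high_systime  |   -6.73%  |    -6.28%   |  +12.36%
> > --------------------------------------------------------
> > shared        |  -28.31%  |   -19.99%   |   -7.10%
> > --------------------------------------------------------
> > short         |  -44.63%  |   -37.48%   |  -33.62%
> > --------------------------------------------------------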
> > 
> Thanks, Jason. Sorry for late response.
> 
> What about the variation of the tests? The machine you test on?

Hi Yuyang,

These tests were also done on an 8 socket machine (80 cores). In terms
of variation between the average throughputs, typically the noise range
is about 2% in many of the workloads.

Jason
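
For readers comparing the quoted table against the noise figure above, here is a minimal sketch of the arithmetic. It is not code from the AIM7 harness: the helper names are made up for illustration, and the 2% band is only the typical noise range mentioned in this reply.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Percent change of the patched kernel's throughput relative to the
 * baseline. Positive means the patched kernel performed better.
 * (Hypothetical helper, not part of any benchmark tool.)
 */
static double percent_diff(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (patched - baseline) / baseline * 100.0;
}

/* True when the magnitude of the change exceeds the noise band. */
static bool outside_noise(double pct, double noise_pct)
{
	double magnitude = pct < 0.0 ? -pct : pct;

	return magnitude > noise_pct;
}
```

With a noise band of about 2%, a row such as all_utime at 10-90 users (+0.33%) would not be distinguishable from noise, while alltests at 10-90 users (-3.37%) would be.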



Re: [PATCH] staging/lustre: use rcu_dereference to access rcu protected current->real_parent field

2014-08-07 Thread Oleg Drokin
Hello!

On Aug 7, 2014, at 11:49 PM, Greg Kroah-Hartman wrote:
>> 
>> This is not a critical bug and in the worst case the code here may
>> cause miss of statistics counter increase.
>> This is why I think it is not worth to backport the patch at all.
> You are right, and if this is just for some random "statistics" file,
> can we just delete the whole function?

I hope not!
This is used all around the client to tally up counts of the various
operations executed.

The statistic is then used by various userspace monitoring tools.

Bye,
Oleg


RE: [PATCH] staging:r819xU: coding style: Fixed commenting style

2014-08-07 Thread Sharma, Sanjeev
Oops, I forgot.

Let me correct it and send a v2 patch.

Regards
Sanjeev Sharma

-----Original Message-----
From: Greg KH [mailto:gre...@linuxfoundation.org] 
Sent: Thursday, August 07, 2014 9:33 PM
To: Sharma, Sanjeev
Cc: de...@driverdev.osuosl.org; oor...@gmail.com; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] staging:r819xU: coding style: Fixed commenting style

On Thu, Aug 07, 2014 at 12:15:57PM +0530, Sanjeev Sharma wrote:
> This is a patch to the r819xU_phyreg.h file that fixes commenting 
> style warning
> 
> Signed-off-by: Sanjeev Sharma 
> ---
>  drivers/staging/rtl8192u/r819xU_phyreg.h | 188 
> ---
>  1 file changed, 97 insertions(+), 91 deletions(-)
> 
> diff --git a/drivers/staging/rtl8192u/r819xU_phyreg.h 
> b/drivers/staging/rtl8192u/r819xU_phyreg.h
> index 64285d6..f07d2f1 100644
> --- a/drivers/staging/rtl8192u/r819xU_phyreg.h
> +++ b/drivers/staging/rtl8192u/r819xU_phyreg.h
> @@ -2,10 +2,10 @@
>  #define _R819XU_PHYREG_H
>  
>  
> -#define   RF_DATA0x1d4   
> // FW will write RF data in the register.
> +#define   RF_DATA0x1d4   
> /* FW will write RF data in the register.*/
>  
> -//Register   //duplicate register due to connection: RF_Mode, TRxRN, NumOf 
> L-STF
> -//page 1
> +/* Register   //duplicate register due to connection: RF_Mode, TRxRN, NumOf 
> L-STF */

Does that line look correct?



[PATCH v2] staging:r819xU: coding style: Fixed commenting style

2014-08-07 Thread Sanjeev Sharma
This is a patch to the r819xU_phyreg.h file that fixes
commenting style warning

Signed-off-by: Sanjeev Sharma 
---
 drivers/staging/rtl8192u/r819xU_phyreg.h | 189 ---
 1 file changed, 98 insertions(+), 91 deletions(-)

diff --git a/drivers/staging/rtl8192u/r819xU_phyreg.h b/drivers/staging/rtl8192u/r819xU_phyreg.h
index 64285d6..b855627 100644
--- a/drivers/staging/rtl8192u/r819xU_phyreg.h
+++ b/drivers/staging/rtl8192u/r819xU_phyreg.h
@@ -2,10 +2,11 @@
 #define _R819XU_PHYREG_H
 
 
-#define   RF_DATA  0x1d4   // FW will write RF data in the register.
+#define   RF_DATA  0x1d4   /* FW will write RF data in the register.*/
 
-//Register   //duplicate register due to connection: RF_Mode, TRxRN, NumOf L-STF
-//page 1
+/* Register duplicate register due to connection: RF_Mode, TRxRN, NumOf L-STF
+ * page 1
+ */
 #define rPMAC_Reset    0x100
 #define rPMAC_TxStart  0x104
 #define rPMAC_TxLegacySIG  0x108
@@ -34,15 +35,16 @@
 #define rPMAC_CCKCRxRC32OK 0x188
 #define rPMAC_TxStatus 0x18c
 
-//page8
-#define rFPGA0_RFMOD   0x800  //RF mode & CCK TxSC
+/* page8 */
+#define rFPGA0_RFMOD   0x800  /* RF mode & CCK TxSC */
 #define rFPGA0_TxInfo  0x804
 #define rFPGA0_PSDFunction 0x808
 #define rFPGA0_TxGainStage 0x80c
 #define rFPGA0_RFTiming1   0x810
 #define rFPGA0_RFTiming2   0x814
-//#define rFPGA0_XC_RFTiming   0x818
-//#define rFPGA0_XD_RFTiming   0x81c
+/* #define rFPGA0_XC_RFTiming  0x818
+ * #define rFPGA0_XD_RFTiming  0x81c
+ */
 #define rFPGA0_XA_HSSIParameter1   0x820
 #define rFPGA0_XA_HSSIParameter2   0x824
 #define rFPGA0_XB_HSSIParameter1   0x828
@@ -79,51 +81,51 @@
 #define rFPGA0_XAB_RFInterfaceRB   0x8e0
 #define rFPGA0_XCD_RFInterfaceRB   0x8e4
 
-//page 9
-#define rFPGA1_RFMOD   0x900  //RF mode & OFDM TxSC
+/* page 9 */
+#define rFPGA1_RFMOD   0x900  /* RF mode & OFDM TxSC */
 #define rFPGA1_TxBlock 0x904
 #define rFPGA1_DebugSelect 0x908
 #define rFPGA1_TxInfo  0x90c
 
-//page a
+/* page a */
 #define rCCK0_System   0xa00
 #define rCCK0_AFESetting   0xa04
 #define rCCK0_CCA  0xa08
-#define rCCK0_RxAGC1   0xa0c  //AGC default value, saturation level
-#define rCCK0_RxAGC2   0xa10  //AGC & DAGC
+#define rCCK0_RxAGC1   0xa0c  /* AGC default value, saturation level */
+#define rCCK0_RxAGC2   0xa10  /* AGC & DAGC */
 #define rCCK0_RxHP 0xa14
-#define rCCK0_DSPParameter1    0xa18  //Timing recovery & Channel estimation threshold
-#define rCCK0_DSPParameter2    0xa1c  //SQ threshold
+#define rCCK0_DSPParameter1    0xa18  /* Timing recovery & Channel estimation threshold */
+#define rCCK0_DSPParameter2    0xa1c  /* SQ threshold */
 #define rCCK0_TxFilter1    0xa20
 #define rCCK0_TxFilter2    0xa24
-#define rCCK0_DebugPort    0xa28  //debug port and Tx filter3
-#define rCCK0_FalseAlarmReport 0xa2c  //0xa2d
+#define rCCK0_DebugPort    0xa28  /* debug port and Tx filter3 */
+#define rCCK0_FalseAlarmReport 0xa2c  /* 0xa2d */
 #define rCCK0_TRSSIReport  0xa50
-#define rCCK0_RxReport 0xa54  //0xa57
-#define rCCK0_FACounterLower   0xa5c  //0xa5b
-#define rCCK0_FACounterUpper   0xa58  //0xa5c
+#define rCCK0_RxReport 0xa54  /* 0xa57 */
+#define rCCK0_FACounterLower   0xa5c  /* 0xa5b */
+#define rCCK0_FACounterUpper   0xa58  /* 0xa5c */
 
-//page c
+/* page c */
 #define rOFDM0_LSTF    0xc00
 #define rOFDM0_TRxPathEnable   0xc04
 #define rOFDM0_TRMuxPar    0xc08
 #define rOFDM0_TRSWIsolation   0xc0c
-#define rOFDM0_XARxAFE 0xc10  //RxIQ DC offset, Rx digital filter, DC notch filter
-#define rOFDM0_XARxIQImbalance 0xc14  //RxIQ imblance matrix
+#define rOFDM0_XARxAFE 0xc10  /* RxIQ DC offset, Rx digital filter, DC notch filter */
+#define rOFDM0_XARxIQImbalance 0xc14  /* RxIQ imblance matrix */
 #define rOFDM0_XBRxAFE 0xc18
 #define rOFDM0_XBRxIQImbalance 0xc1c
 #define 

[PATCH v5 3/4] ARM: dts: add rk3288 dwc2 controller support

2014-08-07 Thread Kever Yang
rk3288 has two kinds of USB controller; this adds the dwc2 controllers
for otg and host1.

The controller can work with the USB PHY default settings and Vbus on.

Signed-off-by: Kever Yang 
Reviewed-by: Doug Anderson 
Tested-by: Doug Anderson 
---

Changes in v5:
- change the sort order of dwc2 in rk3288.dtsi

Changes in v4: None
Changes in v3:
- EHCI and HSIC move new for version 3.

Changes in v2: None

 arch/arm/boot/dts/rk3288.dtsi | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index 5950b0a..58167f1 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -206,6 +206,26 @@
 
/* NOTE: ohci@ff52 doesn't actually work on hardware */
 
+   usb_host1: usb@ff54 {
+   compatible = "rockchip,rk3288-usb", "rockchip,rk3066-usb",
+   "snps,dwc2";
+   reg = <0xff54 0x4>;
+   interrupts = ;
+   clocks = < HCLK_USBHOST1>;
+   clock-names = "otg";
+   status = "disabled";
+   };
+
+   usb_otg: usb@ff58 {
+   compatible = "rockchip,rk3288-usb", "rockchip,rk3066-usb",
+   "snps,dwc2";
+   reg = <0xff58 0x4>;
+   interrupts = ;
+   clocks = < HCLK_OTG0>;
+   clock-names = "otg";
+   status = "disabled";
+   };
+
usb_hsic: usb@ff5c {
compatible = "generic-ehci";
reg = <0xff5c 0x100>;
-- 
1.9.1



[PATCH v5 1/4] Documentation: dt-bindings: add dt binding info for Rockchip dwc2

2014-08-07 Thread Kever Yang
This adds the necessary dwc2 binding documentation for Rockchip SoCs:
rk3066, rk3188 and rk3288

Signed-off-by: Kever Yang 
Acked-by: Stephen Warren 
---

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2:
- Split out dr_mode and rk3288 bindings.
- add compatible "snps,dwc2" binding info

 Documentation/devicetree/bindings/usb/dwc2.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/usb/dwc2.txt b/Documentation/devicetree/bindings/usb/dwc2.txt
index 467ddd1..2899679 100644
--- a/Documentation/devicetree/bindings/usb/dwc2.txt
+++ b/Documentation/devicetree/bindings/usb/dwc2.txt
@@ -4,6 +4,9 @@ Platform DesignWare HS OTG USB 2.0 controller
 Required properties:
 - compatible : One of:
   - brcm,bcm2835-usb: The DWC2 USB controller instance in the BCM2835 SoC.
+  - rockchip,rk3066-usb: The DWC2 USB controller instance in the rk3066 Soc;
+  - "rockchip,rk3188-usb", "rockchip,rk3066-usb", "snps,dwc2": for rk3188 Soc;
+  - "rockchip,rk3288-usb", "rockchip,rk3066-usb", "snps,dwc2": for rk3288 Soc;
   - snps,dwc2: A generic DWC2 USB controller with default parameters.
 - reg : Should contain 1 register range (address and length)
 - interrupts : Should contain 1 interrupt
-- 
1.9.1



[PATCH v5 2/4] usb: dwc2: add compatible data for rockchip soc

2014-08-07 Thread Kever Yang
This patch adds compatible data for the dwc2 controller found on
rk3066, rk3188 and rk3288 processors from Rockchip.

Signed-off-by: Kever Yang 
Acked-by: Paul Zimmerman 
---

Changes in v5:
- change max_transfer_size to 65535 to meet the requirement of the
  header file

Changes in v4:
- change max_transfer_size to 65536, which should be enough
  for most transfers; the hardware auto-detect will set this
  to 0x7 which may make dma_alloc_coherent fail when
  a non-dword-aligned buf from a driver like usbnet shows up.

Changes in v3: None
Changes in v2:
- set most parameters as driver auto-detect

 drivers/usb/dwc2/platform.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index a10e7a3..2f859bd 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -75,6 +75,34 @@ static const struct dwc2_core_params params_bcm2835 = {
.uframe_sched   = 0,
 };
 
+static const struct dwc2_core_params params_rk3066 = {
+   .otg_cap= 2,/* non-HNP/non-SRP */
+   .otg_ver= -1,
+   .dma_enable = -1,
+   .dma_desc_enable= 0,
+   .speed  = -1,
+   .enable_dynamic_fifo= 1,
+   .en_multiple_tx_fifo= -1,
+   .host_rx_fifo_size  = 520,  /* 520 DWORDs */
+   .host_nperio_tx_fifo_size   = 128,  /* 128 DWORDs */
+   .host_perio_tx_fifo_size= 256,  /* 256 DWORDs */
+   .max_transfer_size  = 65535,
+   .max_packet_count   = -1,
+   .host_channels  = -1,
+   .phy_type   = -1,
+   .phy_utmi_width = -1,
+   .phy_ulpi_ddr   = -1,
+   .phy_ulpi_ext_vbus  = -1,
+   .i2c_enable = -1,
+   .ulpi_fs_ls = -1,
+   .host_support_fs_ls_low_power   = -1,
+   .host_ls_low_power_phy_clk  = -1,
+   .ts_dline   = -1,
+   .reload_ctl = -1,
+   .ahbcfg = 0x7, /* INCR16 */
+   .uframe_sched   = -1,
+};
+
 /**
  * dwc2_driver_remove() - Called when the DWC_otg core is unregistered with the
  * DWC_otg driver
@@ -97,6 +125,7 @@ static int dwc2_driver_remove(struct platform_device *dev)
 
 static const struct of_device_id dwc2_of_match_table[] = {
{ .compatible = "brcm,bcm2835-usb", .data = &params_bcm2835 },
+   { .compatible = "rockchip,rk3066-usb", .data = &params_rk3066 },
{ .compatible = "snps,dwc2", .data = NULL },
{},
 };
-- 
1.9.1
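
As an aside on the v4 to v5 change in the changelog above (65536 to 65535), a rough sketch of the kind of range check involved follows. The bounds below are an assumption: they reflect the range I recall from the dwc2 header comment for max_transfer_size, not something stated in this patch, and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ASSUMED bounds for max_transfer_size, in bytes. These are illustrative
 * values and should be checked against the dwc2 header before relying on
 * them.
 */
#define ASSUMED_MIN_TRANSFER_SIZE	2047
#define ASSUMED_MAX_TRANSFER_SIZE	65535

/* Hypothetical helper: true if val lies inside the assumed valid range. */
static bool transfer_size_in_range(int val)
{
	return val >= ASSUMED_MIN_TRANSFER_SIZE &&
	       val <= ASSUMED_MAX_TRANSFER_SIZE;
}
```

Under those assumed bounds, the v4 value of 65536 falls one past the upper limit, while the v5 value of 65535 is accepted.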



[PATCH v5 4/4] ARM: dts: Enable USB host1(dwc) on rk3288-evb

2014-08-07 Thread Kever Yang
The USB host1 port is the host A port near the otg port.

Signed-off-by: Kever Yang 

---

Changes in v5:
- don't enable otg port for evb

Changes in v4: None
Changes in v3:
- Rebase

Changes in v2:
- evb patch added in version 2

 arch/arm/boot/dts/rk3288-evb.dtsi | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/rk3288-evb.dtsi b/arch/arm/boot/dts/rk3288-evb.dtsi
index 4f57209..4b62df6 100644
--- a/arch/arm/boot/dts/rk3288-evb.dtsi
+++ b/arch/arm/boot/dts/rk3288-evb.dtsi
@@ -94,3 +94,7 @@
 _host0_ehci {
status = "okay";
 };
+
+&usb_host1 {
+   status = "okay";
+};
-- 
1.9.1



[PATCH v5 0/4] Patches to add support for Rockchip dwc2 controller

2014-08-07 Thread Kever Yang
These patches add support for the dwc2 controller found in
Rockchip processors rk3066, rk3188 and rk3288,
and enable it in the dts for the rk3288 evb.

Changes in v5:
- change max_transfer_size to 65535 to meet the requirement of the
  header file
- change the sort order of dwc2 in rk3288.dtsi
- don't enable otg port for evb

Changes in v4:
- change max_transfer_size to 65536, which should be enough
  for most transfers; the hardware auto-detect will set this
  to 0x7 which may make dma_alloc_coherent fail when
  a non-dword-aligned buf from a driver like usbnet shows up.
- remove the EHCI and HSIC dts patch, since Doug has posted it separately.

Changes in v3:
- EHCI and HSIC move new for version 3.
- Rebase

Changes in v2:
- Split out dr_mode and rk3288 bindings.
- add compatible "snps,dwc2" binding info
- set most parameters as driver auto-detect
- evb patch added in version 2

Kever Yang (4):
  Documentation: dt-bindings: add dt binding info for Rockchip dwc2
  usb: dwc2: add compatible data for rockchip soc
  ARM: dts: add rk3288 dwc2 controller support
  ARM: dts: Enable USB host1(dwc) on rk3288-evb

 Documentation/devicetree/bindings/usb/dwc2.txt |  3 +++
 arch/arm/boot/dts/rk3288-evb.dtsi  |  4 
 arch/arm/boot/dts/rk3288.dtsi  | 20 ++
 drivers/usb/dwc2/platform.c| 29 ++
 4 files changed, 56 insertions(+)

-- 
1.9.1



Re: [PATCH] staging/lustre: use rcu_dereference to access rcu protected current->real_parent field

2014-08-07 Thread Greg Kroah-Hartman
On Thu, Aug 07, 2014 at 02:13:50PM +0300, Evgeny Budilovsky wrote:
> On Thu, Aug 7, 2014 at 12:42 AM, Greg Kroah-Hartman
>  wrote:
> > On Wed, Aug 06, 2014 at 09:22:43PM +0300, Evgeny Budilovsky wrote:
> >>
> >>
> >> Signed-off-by: Evgeny Budilovsky 
> >
> > Why is this needed?  Is the current code a bug?  Where was the reference
> > added?  Is this causing a problem without this patch applied?  How far
> > back should it be backported, if at all?
> >
> > I need lots more details here before I can take this patch, sorry.
> 
> Sorry for the little information in the previous mail.
> 
> The motivation for this patch was to clean some of the warnings that
> were generated
> on drivers/staging by the sparse utility.
> 
> For this particular case the warning was
> staging/lustre/lustre/llite/lproc_llite.c:913:51: warning: dereference
> of noderef expression
> 
> And this is because current->real_parent is accessed directly and not
> through rcu_dereference,
> which is the common way to access it throughout the kernel.
> 
> This is not a critical bug and in the worst case the code here may
> cause miss of statistics counter increase.
> This is why I think it is not worth to backport the patch at all.

You are right, and if this is just for some random "statistics" file,
can we just delete the whole function?

thanks,

greg k-h
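
To illustrate the access pattern the quoted explanation refers to, here is a user-space analogy only, not the lustre patch and not kernel code. It uses a C11 acquire load as a stand-in for rcu_dereference(); the struct and function names are hypothetical. In the kernel, the real rcu_dereference() must additionally be used inside an rcu_read_lock()/rcu_read_unlock() section, which this sketch does not model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; only the fields needed here. */
struct task {
	int pid;
	_Atomic(struct task *) real_parent;
};

/*
 * Stand-in for rcu_dereference(): fetch the shared pointer exactly once,
 * with ordering, into a local variable. All later accesses go through
 * the local copy rather than re-reading the shared field.
 */
static struct task *load_parent(struct task *t)
{
	return atomic_load_explicit(&t->real_parent, memory_order_acquire);
}

/* Returns the parent's pid, or -1 if there is no parent. */
static int parent_pid(struct task *t)
{
	struct task *parent = load_parent(t);

	return parent ? parent->pid : -1;
}
```

The point of the pattern is that the pointer is read once through an annotated accessor, which is also what lets sparse verify the access.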


3.15.8 USB issue with uvc cam

2014-08-07 Thread Udo van den Heuvel
Hello,

I get WARNINGs when trying to use a Logitech C615 cam.
See attachment for full dmesg of errors but excerpt below:

[80346.835015] xhci_hcd :02:00.0: ERROR: unexpected command
completion code 0x11.
[80346.835027] usb 6-2: Not enough bandwidth for altsetting 11
[80346.835137] ------------[ cut here ]------------
[80346.835155] WARNING: CPU: 3 PID: 20594 at
drivers/media/v4l2-core/videobuf2-core.c:2011
__vb2_queue_cancel+0x102/0x170 [videobuf2_core]()
[80346.835158] Modules linked in: uvcvideo cdc_acm bnep bluetooth fuse
edac_core cpufreq_userspace ipt_REJECT nf_conntrack_netbios_ns
nf_conntrack_broadcast iptable_filter ip6t_REJECT ipt_MASQUERADE
xt_tcpudp nf_conntrack_ipv6 iptable_nat nf_defrag_ipv6 nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack ip_tables
ip6table_filter ip6_tables x_tables eeprom it87 hwmon_vid ext2
snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi ppdev pwc
videobuf2_vmalloc videobuf2_memops kvm_amd kvm v4l2_common
videobuf2_core snd_hda_codec_realtek snd_hda_codec_generic videodev
snd_hda_intel snd_hda_controller cp210x snd_hda_codec usbserial snd_seq
snd_seq_device microcode snd_pcm parport_serial parport_pc parport
snd_timer k10temp snd evdev i2c_piix4 button acpi_cpufreq nfsd
auth_rpcgss oid_registry
[80346.835218]  nfs_acl lockd sunrpc binfmt_misc autofs4 hid_generic
usbhid ohci_pci ehci_pci ehci_hcd ohci_hcd radeon sr_mod cdrom fbcon
bitblit cfbfillrect softcursor cfbimgblt cfbcopyarea font i2c_algo_bit
xhci_hcd backlight drm_kms_helper ttm drm fb fbdev
[80346.835250] CPU: 3 PID: 20594 Comm: skype Tainted: GW
3.15.8 #6
[80346.835254] Hardware name: Gigabyte Technology Co., Ltd. To be filled
by O.E.M./F2A85X-UP4, BIOS F5a 04/30/2013
[80346.835257]   79d580f4 814f2373

[80346.835262]  81069fe1  88040ec532e8

[80346.835267]  88040ec530d8 8803f0c46f00 a041d832
88040ec530d8
[80346.835272] Call Trace:
[80346.835283]  [] ? dump_stack+0x4a/0x75
[80346.835289]  [] ? warn_slowpath_common+0x81/0xb0
[80346.835299]  [] ? __vb2_queue_cancel+0x102/0x170
[videobuf2_core]
[80346.835307]  [] ? vb2_internal_streamoff+0x1d/0x50
[videobuf2_core]
[80346.835314]  [] ? uvc_queue_enable+0x75/0xb0 [uvcvideo]
[80346.835321]  [] ? uvc_video_enable+0x141/0x1a0
[uvcvideo]
[80346.835327]  [] ? uvc_v4l2_do_ioctl+0xd6f/0x1580
[uvcvideo]
[80346.835339]  [] ? video_usercopy+0x1f0/0x490 [videodev]
[80346.835345]  [] ?
uvc_v4l2_set_streamparm.isra.12+0x1c0/0x1c0 [uvcvideo]
[80346.835352]  [] ? preempt_count_add+0x3f/0x90
[80346.835356]  [] ? _raw_spin_lock+0xe/0x30
[80346.835360]  [] ? _raw_spin_unlock+0xd/0x30
[80346.835367]  [] ? __pte_alloc+0xce/0x170
[80346.835376]  [] ? v4l2_ioctl+0x11f/0x160 [videodev]
[80346.835386]  [] ? do_video_ioctl+0x246/0x1330
[videodev]
[80346.835392]  [] ? mmap_region+0x15a/0x5a0
[80346.835402]  [] ? v4l2_compat_ioctl32+0x82/0xb8
[videodev]
[80346.835408]  [] ? compat_SyS_ioctl+0x132/0x1120
[80346.835414]  [] ? vm_mmap_pgoff+0xe3/0x120
[80346.835421]  [] ? cstar_dispatch+0x7/0x1a
[80346.835424] ---[ end trace 44e3d272b6c91a71 ]---
[80346.835427] ------------[ cut here ]------------


What is wrong here?

Kind regards,
Udo



Re: [PATCH] Hyperv: Trigger DHCP renew after host hibernation

2014-08-07 Thread Greg KH
On Fri, Aug 08, 2014 at 03:13:58AM +, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Richard Weinberger [mailto:richard.weinber...@gmail.com]
> > Sent: Friday, August 8, 2014 6:37 AM
> > To: David Miller; Yue Zhang (OSTC DEV)
> > Cc: o...@aepfle.de; net...@vger.kernel.org; driverdev-
> > de...@linuxdriverproject.org; LKML; Greg KH; jasow...@redhat.com;
> > Haiyang Zhang; KY Srinivasan; Thomas Shao; Dexuan Cui
> > Subject: Re: [PATCH] Hyperv: Trigger DHCP renew after host hibernation
> > 
> > On Mon, Jul 21, 2014 at 11:32 PM, David Miller 
> > wrote:
> > > From: Olaf Hering 
> > > Date: Mon, 21 Jul 2014 11:18:51 +0200
> > >
> > >> On Mon, Jul 21, Richard Weinberger wrote:
> > >>
> > >>> My concern is that 10 seconds is maybe not a the right choice.
> > >>> (As we cannot know all implementations)
> > >>
> > >> Until someone reports an issue with it, 10 is fine. Just like 20 or 666.
> > >
> > > Wrong, this is policy and belongs in userspace.
> > 
> > The "/etc/init.d/network restart" nonsense now hit Linus' tree.
> > Yue, what is your proposal to fix that?
> > 
> > //richard
> 
> Hi Richard and all,
> Sorry for the late response -- actually we have been trying to
> figure out a solution that's acceptable to all.
> 
> IMO the most feasible and need-the-least-change solution may be:
> the hyperv network VSC driver passes the event
> RNDIS_STATUS_NETWORK_CHANGE to the udev daemon?
> 
> In this way, every distro only needs to add a udev rule, which should
> be simple.

No, don't do that, again, act like any other network device, drop the
link and bring it up when it comes back.

greg k-h


Re: linux-next: build failure after merge of the mmc-uh tree

2014-08-07 Thread Stephen Rothwell
Hi all,

On Mon, 28 Jul 2014 14:46:08 +1000 Stephen Rothwell  
wrote:
>
> After merging the mmc-uh tree, today's linux-next build (arm
> multi_v7_defconfig) failed like this:
> 
> drivers/mmc/host/dw_mmc.c: In function 'dw_mci_reset':
> drivers/mmc/host/dw_mmc.c:2262:3: error: implicit declaration of function 
> 'dw_mci_idmac_reset' [-Werror=implicit-function-declaration]
>dw_mci_idmac_reset(host);
>^
> 
> Caused by commit 25f7dadbd982 ("mmc: dw_mmc: change to use recommended
> reset procedure").
> 
> I have used the mmc-uh tree from next-20140725 for today.

Ping.  We are nearly half way through the merge window and there has
been a patch posted for this, but the tree is still broken ...

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


signature.asc
Description: PGP signature


fault injection caused oops in proc_flush_task

2014-08-07 Thread Dave Jones
Because I don't have enough oopses in my life, I decided to play
with the fault injection code today. It's not something I hear about
people trying too often, so I wondered what horrors lurk..

So I ran this..

#!/bin/bash

for FAILTYPE in failslab fail_page_alloc
do
 echo N > /sys/kernel/debug/$FAILTYPE/task-filter
 echo 50 > /sys/kernel/debug/$FAILTYPE/probability
 echo 500 > /sys/kernel/debug/$FAILTYPE/interval
 echo -1 > /sys/kernel/debug/$FAILTYPE/times
 echo 0 > /sys/kernel/debug/$FAILTYPE/space
 echo 0 > /sys/kernel/debug/$FAILTYPE/verbose
 echo 2 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
done

And then ran my usual fuzzing session, and saw this ..

Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 2 PID: 8506 Comm: trinity-c124 Not tainted 3.16.0+ #41
task: 880227fc95e0 ti: 8800929a task.ti: 8800929a
RIP: 0010:[]  [] proc_flush_task+0x99/0x1b0
RSP: 0018:8800929a3d40  EFLAGS: 00010246
RAX: 0001 RBX: 8800929a3d6b RCX: 
RDX: 8800929a3d6c RSI: 8800929a3d58 RDI: 
RBP: 8800929a3da8 R08: 000a R09: fffb
R10:  R11:  R12: 0001
R13:  R14:  R15: 0002
FS:  7f019c95d700() GS:88024d10() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2:  CR3: 959e7000 CR4: 001407e0
DR0: 0249e000 DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0600
Stack:
 88022ad836c0 0002929a3d58 88022ad836c0 000132313068
 8800929a3d6b 373231003277b6b0 88022377b600 898577d0
 88022377b6b0 278f 0010 
Call Trace:
 [] release_task+0x4c/0x4a0
 [] wait_consider_task+0x70a/0xbe0
 [] do_wait+0x144/0x2d0
 [] SyS_wait4+0x7b/0x100
 [] ? task_stopped_code+0x60/0x60
 [] tracesys+0xdd/0xe2
Code: d4 e0 a9 86 48 03 45 a8 89 4d a4 44 8b 78 30 48 8b 40 38 44 89 f9 4c 8b 
a8 40 08 00 00 31 c0 e8 6e 21 0f 00 48 8d 75 b0 89 45 b4 <49> 8b 7d 00 e8 4e 8b 
fa ff 48 85 c0 49 89 c6 74 18 48 89 c7 e8 
RIP  [] proc_flush_task+0x99/0x1b0


Right before the oops, the last thing fault injection logged wrt that pid was..

FAULT_INJECTION: forcing a failure
CPU: 0 PID: 8506 Comm: trinity-c124 Not tainted 3.16.0+ #41
 0032 898577d0 8800929a3b08 86759797
 86c6a300 8800929a3b28 86358c30 8020
 8020 8800929a3b38 861c6850 8800929a3b88
Call Trace:
 [] dump_stack+0x4e/0x7a
 [] should_fail+0x100/0x110
 [] should_failslab+0x40/0x50
 [] kmem_cache_alloc+0x5e/0x270
 [] ida_pre_get+0x69/0xf0
 [] ? proc_fill_super+0xa0/0xa0
 [] get_anon_bdev+0x39/0x120
 [] ? proc_fill_super+0xa0/0xa0
 [] set_anon_super+0x16/0x30
 [] proc_set_super+0x1d/0x80
 [] sget+0x33a/0x400
 [] ? proc_root_lookup+0x40/0x40
 [] proc_mount+0xa7/0x150
 [] mount_fs+0x38/0x1c0
 [] vfs_kern_mount+0x64/0x120
 [] kern_mount_data+0x19/0x30
 [] pid_ns_prepare_proc+0x1c/0x30
 [] alloc_pid+0x474/0x4c0
 [] ? flush_tlb_mm_range+0x80/0x200
 [] ? copy_thread+0x11d/0x2c0
 [] copy_process.part.29+0xab0/0x1be0
 [] do_fork+0xdd/0x400
 [] ? preempt_count_sub+0xab/0x100
 [] ? __this_cpu_preempt_check+0x13/0x20
 [] SyS_clone+0x16/0x20
 [] stub_clone+0x69/0x90
 [] ? tracesys+0xdd/0xe2

Should proc_flush_task just be checking for a NULL upid->ns ?
Or is there something in the pid_ns_prepare_proc failure path
that we're failing to undo ?

thoughts?

I don't know how feasible it would be to hit that in real life
without the fault injection stuff, but an oops can't be the right
thing to do in any case.

Dave
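
To make the first option raised above concrete (checking for a missing pointer before flushing), here is a user-space mock. It is a sketch under stated assumptions, not the kernel's proc_flush_task() and not a proposed fix: every type and function name below is hypothetical, and it assumes the failed mount left a NULL pointer behind at one namespace level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock types; none of these are the kernel's structures. */
struct mock_mount {
	int flushed;			/* how many times this mount was flushed */
};

struct mock_pid_ns {
	struct mock_mount *proc_mnt;	/* NULL if the proc mount failed */
};

struct mock_upid {
	int nr;
	struct mock_pid_ns *ns;
};

/*
 * Walk every namespace level and flush its proc mount, skipping levels
 * where the namespace or its mount was never set up. Returns the number
 * of levels actually flushed.
 */
static int flush_levels(struct mock_upid *levels, size_t n)
{
	int flushed = 0;

	for (size_t i = 0; i < n; i++) {
		struct mock_pid_ns *ns = levels[i].ns;

		if (!ns || !ns->proc_mnt)
			continue;	/* nothing was mounted, nothing to flush */

		ns->proc_mnt->flushed++;
		flushed++;
	}

	return flushed;
}
```

Whether a check like this or a cleanup in the pid_ns_prepare_proc failure path is the right answer is exactly the open question in the mail; the mock only shows what the check would look like.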


Re: [GIT PULL] Global signal cleanup

2014-08-07 Thread Stephen Rothwell
Hi all,

On Thu, 7 Aug 2014 08:53:56 -1000 Linus Torvalds 
 wrote:
>
> On Wed, Aug 6, 2014 at 9:35 PM, Richard Weinberger  wrote:
> >
> > It would be nice to see these rules written down somewhere.
> 
> The rules have been pretty clear: "don't rebase public trees".
> 
> That's always been the basic rule. There are _exceptions_ when
> rebasing is the right thing to do, and they all boil down to "lesser
> of two evils", but the evils really have to be pretty big.
> 
> Possible reasons to rebase:
> 
>  (a) It's not public yet. You haven't pushed to kernel.org or any
> other public site, and nobody saw you do it.

So this would not be in linux-next, so I don't care :-)

>  (b) You *really* screwed up, and the downsides of rebasing are
> smaller than the downsides of exposing it.
> 
>  As in "oops, that half-way commit doesn't even compile or work at
> all, so leaving it in that state will screw up anybody trying to find
> other bugs with 'git bisect'"
> 
>  At the same time, if you do this just before pushing to me, maybe
> you should take a step back and say "oops, my tree was completely
> broken, maybe I shouldn't push this to Linus just after fixing it".

And this is fine but shouldn't happen just before sending a pull
request (as Linus said).  It may also require informing anyone who
depends on your tree (especially if that other tree is also in
linux-next ... otherwise I could easily end up with both versions).

>  (c) You want to clean things up, and you're not even remotely ready
> to push things upstream, and while people have *seen* your work,
> nobody relies on it or uses it.

And this should not be in linux-next yet, so again I don't care and
shouldn't see it.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




RE: [PATCH] Hyperv: Trigger DHCP renew after host hibernation

2014-08-07 Thread Dexuan Cui
> -----Original Message-----
> From: Richard Weinberger [mailto:richard.weinber...@gmail.com]
> Sent: Friday, August 8, 2014 6:37 AM
> To: David Miller; Yue Zhang (OSTC DEV)
> Cc: o...@aepfle.de; net...@vger.kernel.org; driverdev-
> de...@linuxdriverproject.org; LKML; Greg KH; jasow...@redhat.com;
> Haiyang Zhang; KY Srinivasan; Thomas Shao; Dexuan Cui
> Subject: Re: [PATCH] Hyperv: Trigger DHCP renew after host hibernation
> 
> On Mon, Jul 21, 2014 at 11:32 PM, David Miller 
> wrote:
> > From: Olaf Hering 
> > Date: Mon, 21 Jul 2014 11:18:51 +0200
> >
> >> On Mon, Jul 21, Richard Weinberger wrote:
> >>
> >>> My concern is that 10 seconds is maybe not the right choice.
> >>> (As we cannot know all implementations)
> >>
> >> Until someone reports an issue with it, 10 is fine. Just like 20 or 666.
> >
> > Wrong, this is policy and belongs in userspace.
> 
> The "/etc/init.d/network restart" nonsense now hit Linus' tree.
> Yue, what is your proposal to fix that?
> 
> //richard

Hi Richard and all,
Sorry for the late response -- actually we have been trying to
figure out a solution that's acceptable to all.

IMO the most feasible and need-the-least-change solution may be:
the hyperv network VSC driver passes the event
RNDIS_STATUS_NETWORK_CHANGE to the udev daemon?

In this way, every distro only needs to add a udev rule, which should
be simple.

Any comment?

-- Dexuan

disable "Call Trace:"

2014-08-07 Thread 積丹尼 Dan Jacobson
Idea: if the user could disable
the "Call Trace:"
seen upon "Kernel panic - not syncing: attempted to kill init!"
he would then be able to see more lines above without having it shoved
off the screen with the useless (to him) call trace.

e.g., using panic=222 call_trace=disabled would give him enough time and room
to read more of what happened from the screen.


Re: [PATCH v2 1/5] tracing: Do not do anything special with tracepoint_string when tracing is disabled

2014-08-07 Thread Nicolas Pitre
On Thu, 7 Aug 2014, Steven Rostedt wrote:

> Because ftrace_events.h is not included when config tracing is not
> enabled, I got error messages when compiling arm and arm64 without
> tracing enabled. This is the new patch I'm now testing that moves the
> tracepoint_string code to include/linux/tracepoint.h as well.

Makes sense.



> 
> -- Steve
> 
> From 3c49b52b155d0f723792377e1a4480a0e7ca0ba2 Mon Sep 17 00:00:00 2001
> From: Steven Rostedt 
> Date: Fri, 25 Jul 2014 16:05:29 -0400
> Subject: [PATCH] tracing: Do not do anything special with tracepoint_string
>  when tracing is disabled
> 
> When CONFIG_TRACING is not enabled, there's no reason to save the trace
> strings either by the linker or as a static variable that can be
> referenced later. Simply pass back the string that is given to
> tracepoint_string().
> 
> Had to move the define to include/linux/tracepoint.h so that it is still
> visible when CONFIG_TRACING is not set.
> 
> Link: 
> http://lkml.kernel.org/p/1406318733-26754-2-git-send-email-nicolas.pi...@linaro.org
> 
> Suggested-by: Nicolas Pitre 
> Signed-off-by: Steven Rostedt 
> ---
>  include/linux/ftrace_event.h | 34 --
>  include/linux/tracepoint.h   | 44 
> 
>  2 files changed, 44 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index cff3106ffe2c..c9f619a2070f 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -574,40 +574,6 @@ do {                                              \
>   __trace_printk(ip, fmt, ##args);\
>  } while (0)
>  
> -/**
> - * tracepoint_string - register constant persistent string to trace system
> - * @str - a constant persistent string that will be referenced in tracepoints
> - *
> - * If constant strings are being used in tracepoints, it is faster and
> - * more efficient to just save the pointer to the string and reference
> - * that with a printf "%s" instead of saving the string in the ring buffer
> - * and wasting space and time.
> - *
> - * The problem with the above approach is that userspace tools that read
> - * the binary output of the trace buffers do not have access to the string.
> - * Instead they just show the address of the string which is not very
> - * useful to users.
> - *
> - * With tracepoint_string(), the string will be registered to the tracing
> - * system and exported to userspace via the debugfs/tracing/printk_formats
> - * file that maps the string address to the string text. This way userspace
> - * tools that read the binary buffers have a way to map the pointers to
> - * the ASCII strings they represent.
> - *
> - * The @str used must be a constant string and persistent as it would not
> - * make sense to show a string that no longer exists. But it is still fine
> - * to be used with modules, because when modules are unloaded, if they
> - * had tracepoints, the ring buffers are cleared too. As long as the string
> - * does not change during the life of the module, it is fine to use
> - * tracepoint_string() within a module.
> - */
> -#define tracepoint_string(str)                                        \
> - ({  \
> - static const char *___tp_str __tracepoint_string = str; \
> - ___tp_str;  \
> - })
> -#define __tracepoint_string  __attribute__((section("__tracepoint_str")))
> -
>  #ifdef CONFIG_PERF_EVENTS
>  struct perf_event;
>  
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index 2e2a5f7717e5..b1293f15f592 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -249,6 +249,50 @@ extern void syscall_unregfunc(void);
>  
>  #endif /* CONFIG_TRACEPOINTS */
>  
> +#ifdef CONFIG_TRACING
> +/**
> + * tracepoint_string - register constant persistent string to trace system
> + * @str - a constant persistent string that will be referenced in tracepoints
> + *
> + * If constant strings are being used in tracepoints, it is faster and
> + * more efficient to just save the pointer to the string and reference
> + * that with a printf "%s" instead of saving the string in the ring buffer
> + * and wasting space and time.
> + *
> + * The problem with the above approach is that userspace tools that read
> + * the binary output of the trace buffers do not have access to the string.
> + * Instead they just show the address of the string which is not very
> + * useful to users.
> + *
> + * With tracepoint_string(), the string will be registered to the tracing
> + * system and exported to userspace via the debugfs/tracing/printk_formats
> + * file that maps the string address to the string text. This way userspace
> + * tools that read the binary buffers have a way to map the pointers to
> + * the ASCII strings they 

Re: [GIT PULL] SELinux/NetLabel fixes for 3.17

2014-08-07 Thread Stephen Rothwell
Hi Paul,

I am not sure who this was directed at, but if you want Linus to pull
these (and that would be good) you need to send this to him explicitly.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram

2014-08-07 Thread David Horner
 [2/3]


 But why isn't mem_used_max writable? (save tearing down and rebuilding
 device to reset max)

 static DEVICE_ATTR(mem_used_max, S_IRUGO, mem_used_max_show, NULL);

 static DEVICE_ATTR(mem_used_max, S_IRUGO | S_IWUSR, mem_used_max_show, NULL);

   with a check in the store() that the new value is positive and less
than current max?
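
 A rough userspace model of that store() check, only to make the idea
 concrete. demo_mem_used_max_store() and its parameters are hypothetical
 names, not zram or zsmalloc API, and the sketch assumes that writing 0
 should be allowed so the recorded peak can be reset:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical model of the suggested store() check: accept a new value
 * only if it parses as a plain decimal number and is lower than the
 * currently recorded maximum. Not actual zram code.
 */
static int demo_mem_used_max_store(const char *buf, unsigned long long *max)
{
	char *end;
	unsigned long long val;

	/* Reject signs, leading whitespace and empty input up front. */
	if (!isdigit((unsigned char)buf[0]))
		return -EINVAL;

	errno = 0;
	val = strtoull(buf, &end, 10);
	if (errno || (*end != '\0' && *end != '\n'))
		return -EINVAL;

	/* Only allow lowering the recorded peak, never raising it. */
	if (val >= *max)
		return -EINVAL;

	*max = val;
	return 0;
}
```

 With a recorded peak of 4096, writing "0" would reset it, while writing
 "8192" would be refused and leave the peak untouched.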


 I'm also a little puzzled why there is a new API zs_get_max_size_bytes if
 the data is accessible through sysfs?
 Especially if the max limit will be (as you propose for [3/3]) accessed
 through zsmalloc, so that zram needn't access it.



  [3/3]
 I concur that the zram limit is best implemented in zsmalloc.
 I am looking forward to that revised code.


> From: Minchan Kim  kernel.org>
> Subject: [RFC 2/3] zsmalloc/zram: add zs_get_max_size_bytes and use it in zram
> Newsgroups: gmane.linux.kernel.mm, gmane.linux.kernel
> Date: 2014-08-05 08:02:02 GMT
>
> Normally, zram user can get maximum memory zsmalloc consumed via
> polling mem_used_total with sysfs in userspace.
>
> But it has a critical problem because user can miss peak memory
> usage during update interval so that gap between them could be
> huge when memory pressure is really heavy.
>
> This patch adds new API zs_get_max_size_bytes in zsmalloc so
> user(ex, zram) doesn't need to poll in short interval to get
> exact value.
>
> User can just see max memory usage once his test workload is
> done. It's pretty handy and accurate.


Re: Weird NET_RX softirq behavior

2014-08-07 Thread Jisheng Zhang
Hi Dmitry,

On Thu, 7 Aug 2014 07:18:13 -0700
Dmitry Popov  wrote:

> On Thu, 7 Aug 2014 17:10:50 +0800
> Jisheng Zhang  wrote:
> 
> > 2. only one netdev in the system: eth0.
> 
> There should also be lo (loopback) at least.

Yep, I forgot that ;)

> 
> > 4. But NET_RX seems abnormal
> > ~ # cat /proc/softirqs 
> > CPU0   CPU1   CPU2   CPU3   
> >   NET_RX: 445587322983  0
> > 
> > I'm expecting NET_RX under CPU1, 2, 3 should be zero. Any suggestions
> > about this abnormal behavior?
> 
> Do you have any loopback traffic? It could be handled by CPU1/2 explaining
> non-zero NET_RX counters.

Yes. lo has only sent and received 7 packets so far, about 1400 bytes. That
seems small compared with the CPU1 and CPU2 NET_RX softirq numbers, right?

Any other possible case?
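
To watch which CPU's counter actually moves, the NET_RX line can be sampled
twice and the values compared. A minimal sketch of the parsing step follows;
demo_parse_net_rx() is an illustrative helper and not an existing API, and it
assumes the usual "NET_RX:" label followed by one decimal counter per CPU:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse the per-CPU counters from one "NET_RX:" line of /proc/softirqs-style
 * text. Returns the number of counters stored in out[] (at most max_cpus),
 * or -1 if the line is not a NET_RX line.
 */
static int demo_parse_net_rx(const char *line, unsigned long *out, int max_cpus)
{
	const char *p = line;
	int n = 0;

	/* The label is right-aligned in the file, so skip leading blanks. */
	while (*p == ' ' || *p == '\t')
		p++;
	if (strncmp(p, "NET_RX:", 7) != 0)
		return -1;
	p += 7;

	while (n < max_cpus) {
		char *end;
		unsigned long v = strtoul(p, &end, 10);

		if (end == p)	/* no more numbers on this line */
			break;
		out[n++] = v;
		p = end;
	}
	return n;
}
```

Calling it on two samples taken a few seconds apart and subtracting the
arrays shows which CPUs are still handling NET_RX work.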

Thanks for your reply,
Jisheng


Re: [PATCH v2 1/5] tracing: Do not do anything special with tracepoint_string when tracing is disabled

2014-08-07 Thread Steven Rostedt
Because ftrace_events.h is not included when config tracing is not
enabled, I got error messages when compiling arm and arm64 without
tracing enabled. This is the new patch I'm now testing that moves the
tracepoint_string code to include/linux/tracepoint.h as well.

-- Steve

From 3c49b52b155d0f723792377e1a4480a0e7ca0ba2 Mon Sep 17 00:00:00 2001
From: Steven Rostedt 
Date: Fri, 25 Jul 2014 16:05:29 -0400
Subject: [PATCH] tracing: Do not do anything special with tracepoint_string
 when tracing is disabled

When CONFIG_TRACING is not enabled, there's no reason to save the trace
strings either by the linker or as a static variable that can be
referenced later. Simply pass back the string that is given to
tracepoint_string().

Had to move the define to include/linux/tracepoint.h so that it is still
visible when CONFIG_TRACING is not set.
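
As a rough userspace illustration of the two variants described above (a
sketch only): DEMO_TRACING is a hypothetical stand-in for CONFIG_TRACING,
demo_event_name() is a made-up caller, and the statement expression and
section attribute are GCC/Clang extensions.

```c
#include <stddef.h>

/*
 * Sketch of tracepoint_string() with and without tracing support.
 * Nothing here talks to a real tracing system.
 */
#ifdef DEMO_TRACING
#define __tracepoint_string __attribute__((section("__tracepoint_str")))
/* Keep a pointer to the string in a dedicated section for later lookup. */
#define tracepoint_string(str)						\
	({								\
		static const char *___tp_str __tracepoint_string = str;\
		___tp_str;						\
	})
#else
/* Tracing disabled: no extra storage, simply hand the string back. */
#define tracepoint_string(str) str
#endif

/* The call site looks the same in both configurations. */
static const char *demo_event_name(void)
{
	return tracepoint_string("demo_event");
}
```

Either way the caller gets a pointer to the same text; only the tracing
build pays for the extra pointer in its own section.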

Link: 
http://lkml.kernel.org/p/1406318733-26754-2-git-send-email-nicolas.pi...@linaro.org

Suggested-by: Nicolas Pitre 
Signed-off-by: Steven Rostedt 
---
 include/linux/ftrace_event.h | 34 --
 include/linux/tracepoint.h   | 44 
 2 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index cff3106ffe2c..c9f619a2070f 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -574,40 +574,6 @@ do {                                                \
__trace_printk(ip, fmt, ##args);\
 } while (0)
 
-/**
- * tracepoint_string - register constant persistent string to trace system
- * @str - a constant persistent string that will be referenced in tracepoints
- *
- * If constant strings are being used in tracepoints, it is faster and
- * more efficient to just save the pointer to the string and reference
- * that with a printf "%s" instead of saving the string in the ring buffer
- * and wasting space and time.
- *
- * The problem with the above approach is that userspace tools that read
- * the binary output of the trace buffers do not have access to the string.
- * Instead they just show the address of the string which is not very
- * useful to users.
- *
- * With tracepoint_string(), the string will be registered to the tracing
- * system and exported to userspace via the debugfs/tracing/printk_formats
- * file that maps the string address to the string text. This way userspace
- * tools that read the binary buffers have a way to map the pointers to
- * the ASCII strings they represent.
- *
- * The @str used must be a constant string and persistent as it would not
- * make sense to show a string that no longer exists. But it is still fine
- * to be used with modules, because when modules are unloaded, if they
- * had tracepoints, the ring buffers are cleared too. As long as the string
- * does not change during the life of the module, it is fine to use
- * tracepoint_string() within a module.
- */
-#define tracepoint_string(str) \
-   ({  \
-   static const char *___tp_str __tracepoint_string = str; \
-   ___tp_str;  \
-   })
-#define __tracepoint_string    __attribute__((section("__tracepoint_str")))
-
 #ifdef CONFIG_PERF_EVENTS
 struct perf_event;
 
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 2e2a5f7717e5..b1293f15f592 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -249,6 +249,50 @@ extern void syscall_unregfunc(void);
 
 #endif /* CONFIG_TRACEPOINTS */
 
+#ifdef CONFIG_TRACING
+/**
+ * tracepoint_string - register constant persistent string to trace system
+ * @str - a constant persistent string that will be referenced in tracepoints
+ *
+ * If constant strings are being used in tracepoints, it is faster and
+ * more efficient to just save the pointer to the string and reference
+ * that with a printf "%s" instead of saving the string in the ring buffer
+ * and wasting space and time.
+ *
+ * The problem with the above approach is that userspace tools that read
+ * the binary output of the trace buffers do not have access to the string.
+ * Instead they just show the address of the string which is not very
+ * useful to users.
+ *
+ * With tracepoint_string(), the string will be registered to the tracing
+ * system and exported to userspace via the debugfs/tracing/printk_formats
+ * file that maps the string address to the string text. This way userspace
+ * tools that read the binary buffers have a way to map the pointers to
+ * the ASCII strings they represent.
+ *
+ * The @str used must be a constant string and persistent as it would not
+ * make sense to show a string that no longer exists. But it is still fine
+ * to be used with modules, because when modules are unloaded, if they
+ * had tracepoints, the ring 

[PATCH] Removed repeated word in comments in arch/tile/include/uapi/arch/sim_def.h

2014-08-07 Thread Kurt McAlpine
Hello,

I have created a simple patch that removes repeated words in 
arch/tile/include/uapi/arch/sim_def.h

Thanks,
Kurt McAlpine
From 7496bc191f1ea2fde0af9d316e5330bc99e5638d Mon Sep 17 00:00:00 2001
From: Kurt McAlpine 
Date: Thu, 7 Aug 2014 08:50:25 +1200
Subject: [PATCH] Removed repeated word in comments
Signed-off-by: Kurt McAlpine 

---
 arch/tile/include/uapi/arch/sim_def.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/tile/include/uapi/arch/sim_def.h b/arch/tile/include/uapi/arch/sim_def.h
index 4b44a2b..1c06953 100644
--- a/arch/tile/include/uapi/arch/sim_def.h
+++ b/arch/tile/include/uapi/arch/sim_def.h
@@ -360,19 +360,19 @@
  * @{
  */
 
-/** Use with with SIM_PROFILER_CHIP_xxx to control the memory controllers. */
+/** Use with SIM_PROFILER_CHIP_xxx to control the memory controllers. */
 #define SIM_CHIP_MEMCTL 0x001
 
-/** Use with with SIM_PROFILER_CHIP_xxx to control the XAUI interface. */
+/** Use with SIM_PROFILER_CHIP_xxx to control the XAUI interface. */
 #define SIM_CHIP_XAUI  0x002
 
-/** Use with with SIM_PROFILER_CHIP_xxx to control the PCIe interface. */
+/** Use with SIM_PROFILER_CHIP_xxx to control the PCIe interface. */
 #define SIM_CHIP_PCIE  0x004
 
-/** Use with with SIM_PROFILER_CHIP_xxx to control the MPIPE interface. */
+/** Use with SIM_PROFILER_CHIP_xxx to control the MPIPE interface. */
 #define SIM_CHIP_MPIPE 0x008
 
-/** Use with with SIM_PROFILER_CHIP_xxx to control the TRIO interface. */
+/** Use with SIM_PROFILER_CHIP_xxx to control the TRIO interface. */
 #define SIM_CHIP_TRIO  0x010
 
 /** Reference all chip devices. */
-- 
1.9.1



Re: Linux 3.15.9

2014-08-07 Thread Greg KH

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index c584a51add15..afe68ddbe6a4 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,8 @@ ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
 ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
 ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
+ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
+... unused hole ...
 ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
 ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space
 ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
diff --git a/Makefile b/Makefile
index d5d9a22a404a..25b85aba1e2e 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 15
-SUBLEVEL = 8
+SUBLEVEL = 9
 EXTRAVERSION =
 NAME = Double Funky Skunk
 
diff --git a/arch/arm/boot/dts/dra7-evm.dts b/arch/arm/boot/dts/dra7-evm.dts
index 5babba0a3a75..904dcf5973f3 100644
--- a/arch/arm/boot/dts/dra7-evm.dts
+++ b/arch/arm/boot/dts/dra7-evm.dts
@@ -182,6 +182,7 @@
regulator-name = "ldo3";
regulator-min-microvolt = <180>;
regulator-max-microvolt = <180>;
+   regulator-always-on;
regulator-boot-on;
};
 
diff --git a/arch/arm/boot/dts/hi3620.dtsi b/arch/arm/boot/dts/hi3620.dtsi
index ab1116d086be..83a5b8685bd9 100644
--- a/arch/arm/boot/dts/hi3620.dtsi
+++ b/arch/arm/boot/dts/hi3620.dtsi
@@ -73,7 +73,7 @@
 
L2: l2-cache {
compatible = "arm,pl310-cache";
-   reg = <0xfc1 0x10>;
+   reg = <0x10 0x10>;
interrupts = <0 15 4>;
cache-unified;
cache-level = <2>;
diff --git a/arch/arm/crypto/aesbs-glue.c b/arch/arm/crypto/aesbs-glue.c
index 4522366da759..15468fbbdea3 100644
--- a/arch/arm/crypto/aesbs-glue.c
+++ b/arch/arm/crypto/aesbs-glue.c
@@ -137,7 +137,7 @@ static int aesbs_cbc_encrypt(struct blkcipher_desc *desc,
dst += AES_BLOCK_SIZE;
} while (--blocks);
}
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -158,7 +158,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc,
bsaes_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
while (walk.nbytes) {
u32 blocks = walk.nbytes / AES_BLOCK_SIZE;
@@ -182,7 +182,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc,
dst += AES_BLOCK_SIZE;
src += AES_BLOCK_SIZE;
} while (--blocks);
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -268,7 +268,7 @@ static int aesbs_xts_encrypt(struct blkcipher_desc *desc,
bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, &ctx->enc, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -292,7 +292,7 @@ static int aesbs_xts_decrypt(struct blkcipher_desc *desc,
bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
diff --git a/arch/arm/mach-omap2/gpmc-nand.c b/arch/arm/mach-omap2/gpmc-nand.c
index 17cd39360afe..93914d220069 100644
--- a/arch/arm/mach-omap2/gpmc-nand.c
+++ b/arch/arm/mach-omap2/gpmc-nand.c
@@ -50,6 +50,16 @@ static bool gpmc_hwecc_bch_capable(enum omap_ecc ecc_opt)
 soc_is_omap54xx() || soc_is_dra7xx())
return 1;
 
+   if (ecc_opt == OMAP_ECC_BCH4_CODE_HW_DETECTION_SW ||
+ecc_opt == OMAP_ECC_BCH8_CODE_HW_DETECTION_SW) {
+   if (cpu_is_omap24xx())
+   return 0;
+   else if (cpu_is_omap3630() && (GET_OMAP_REVISION() == 0))
+   return 0;
+   else
+ 

Linux 3.14.16

2014-08-07 Thread Greg KH
I'm announcing the release of the 3.14.16 kernel.

All users of the 3.14 kernel series must upgrade.

The updated 3.14.y git tree can be found at:
        git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.14.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Documentation/x86/x86_64/mm.txt         |   2
 Makefile                                |   2
 arch/arm/boot/dts/dra7-evm.dts          |   1
 arch/arm/boot/dts/hi3620.dtsi           |   2
 arch/arm/crypto/aesbs-glue.c            |  10 -
 arch/arm/mm/idmap.c                     |   7 +
 arch/arm/mm/mmu.c                       |   6
 arch/x86/Kconfig                        |  25 +++
 arch/x86/include/asm/espfix.h           |  16 ++
 arch/x86/include/asm/irqflags.h         |   2
 arch/x86/include/asm/pgtable_64_types.h |   2
 arch/x86/include/asm/setup.h            |   2
 arch/x86/kernel/Makefile                |   1
 arch/x86/kernel/entry_32.S              |  12 +
 arch/x86/kernel/entry_64.S              |  77 ++-
 arch/x86/kernel/espfix_64.c             | 208
 arch/x86/kernel/ldt.c                   |  10 -
 arch/x86/kernel/paravirt_patch_64.c     |   2
 arch/x86/kernel/smpboot.c               |   7 +
 arch/x86/mm/dump_pagetables.c           |  31 +++-
 arch/x86/vdso/vdso32-setup.c            |   8 -
 arch/x86/xen/setup.c                    |   9 -
 arch/xtensa/kernel/vectors.S            | 158
 arch/xtensa/kernel/vmlinux.lds.S        |   4
 crypto/af_alg.c                         |   2
 drivers/cpufreq/cpufreq.c               |   6
 drivers/iio/accel/bma180.c              |   8 -
 drivers/iio/industrialio-buffer.c       |   2
 drivers/md/dm-bufio.c                   |   2
 drivers/md/dm-cache-target.c            |  13 --
 drivers/net/wireless/ath/ath9k/xmit.c   |   9 +
 drivers/pnp/pnpacpi/core.c              |   3
 drivers/rapidio/devices/tsi721_dma.c    |   8 +
 drivers/scsi/scsi_lib.c                 |   8 +
 drivers/staging/vt6655/bssdb.c          |   2
 drivers/staging/vt6655/device_main.c    |   7 -
 include/dt-bindings/pinctrl/dra.h       |   7 -
 include/linux/printk.h                  |   6
 init/main.c                             |   4
 kernel/printk/printk.c                  |   2
 kernel/sched/core.c                     |   2
 kernel/sched/deadline.c                 |   2
 kernel/sched/rt.c                       |   2
 kernel/time/clockevents.c               |  10 -
 kernel/time/sched_clock.c               |   4
 lib/btree.c                             |   1
 mm/memcontrol.c                         |   4
 mm/page-writeback.c                     |   6
 mm/page_alloc.c                         |  16 +-
 net/l2tp/l2tp_ppp.c                     |   4
 net/mac80211/tx.c                       |  27 ++--
 net/wireless/trace.h                    |   3
 52 files changed, 628 insertions(+), 146 deletions(-)

Alexandre Bounine (1):
  rapidio/tsi721_dma: fix failure to obtain transaction descriptor

Andy Lutomirski (1):
  x86_64/entry/xen: Do not invoke espfix64 on Xen

Anssi Hannula (1):
  dm cache: fix race affecting dirty block count

Boris Ostrovsky (1):
  x86/espfix/xen: Fix allocation of pages for paravirt page tables

David Rientjes (1):
  mm, thp: do not allow thp faults to avoid cpuset restrictions

David Vrabel (1):
  x86/xen: no need to explicitly register an NMI callback

Eliad Peller (1):
  cfg80211: fix mic_failure tracing

Felix Fietkau (1):
  ath9k: fix aggregation session lockup

Greg Kroah-Hartman (1):
  Linux 3.14.16

Greg Thelen (1):
  dm bufio: fully initialize shrinker

H. Peter Anvin (6):
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
  x86, espfix: Move espfix definitions into a separate header file
  x86, espfix: Fix broken header guard
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Make it possible to disable 16-bit support

Haojian Zhuang (1):
  ARM: dts: fix L2 address in Hi3620

James Bottomley (1):
  scsi: handle flush errors properly

Jan Kara (1):
  timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks

Johannes Berg (1):
  Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan"

John Stultz (1):
  printk: rename printk_sched to printk_deferred

Konstantin Khlebnikov (1):
  ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout

Lars-Peter Clausen (1):
  iio: buffer: Fix demux table creation

Malcolm Priestley (2):
  staging: vt6655: Fix disassociated messages every 10 seconds
  staging: vt6655: Fix Warning on boot handle_irq_event_percpu.


Linux 3.15.9

2014-08-07 Thread Greg KH
I'm announcing the release of the 3.15.9 kernel.

All users of the 3.15 kernel series must upgrade.

NOTE, there will only be 1 more 3.15.y kernel release after this one,
please move to 3.16.y now, you have been warned.

The updated 3.15.y git tree can be found at:
        git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.15.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Documentation/x86/x86_64/mm.txt         |   2
 Makefile                                |   2
 arch/arm/boot/dts/dra7-evm.dts          |   1
 arch/arm/boot/dts/hi3620.dtsi           |   2
 arch/arm/crypto/aesbs-glue.c            |  10 -
 arch/arm/mach-omap2/gpmc-nand.c         |  18 +-
 arch/arm/mm/idmap.c                     |   7 +
 arch/arm/mm/mmu.c                       |   6
 arch/powerpc/perf/core-book3s.c         |   6
 arch/x86/Kconfig                        |  25 +++
 arch/x86/include/asm/espfix.h           |  16 ++
 arch/x86/include/asm/irqflags.h         |   2
 arch/x86/include/asm/pgtable_64_types.h |   2
 arch/x86/include/asm/setup.h            |   2
 arch/x86/kernel/Makefile                |   1
 arch/x86/kernel/entry_32.S              |  12 +
 arch/x86/kernel/entry_64.S              |  77 ++-
 arch/x86/kernel/espfix_64.c             | 208
 arch/x86/kernel/ldt.c                   |  10 -
 arch/x86/kernel/paravirt_patch_64.c     |   2
 arch/x86/kernel/smpboot.c               |   7 +
 arch/x86/mm/dump_pagetables.c           |  44 +-
 arch/x86/vdso/vdso32-setup.c            |   8 -
 arch/x86/xen/setup.c                    |   9 -
 arch/xtensa/kernel/vectors.S            | 158
 arch/xtensa/kernel/vmlinux.lds.S        |   4
 crypto/af_alg.c                         |   2
 drivers/gpu/drm/i915/intel_display.c    |   3
 drivers/iio/accel/bma180.c              |   8 -
 drivers/iio/industrialio-buffer.c       |   2
 drivers/md/dm-bufio.c                   |   2
 drivers/md/dm-cache-target.c            |  13 --
 drivers/net/wireless/ath/ath9k/xmit.c   |   9 +
 drivers/pnp/pnpacpi/core.c              |   3
 drivers/rapidio/devices/tsi721_dma.c    |   8 +
 drivers/scsi/scsi_lib.c                 |   8 +
 drivers/staging/vt6655/bssdb.c          |   2
 drivers/staging/vt6655/device_main.c    |   7 -
 fs/open.c                               |   5
 include/dt-bindings/pinctrl/dra.h       |   7 -
 include/linux/printk.h                  |   6
 init/main.c                             |   4
 kernel/printk/printk.c                  |   2
 kernel/sched/core.c                     |   2
 kernel/sched/deadline.c                 |   2
 kernel/sched/rt.c                       |   2
 kernel/time/clockevents.c               |  10 -
 kernel/time/sched_clock.c               |   4
 lib/btree.c                             |   1
 mm/memcontrol.c                         |   4
 mm/page-writeback.c                     |   6
 mm/page_alloc.c                         |  16 +-
 net/l2tp/l2tp_ppp.c                     |   4
 net/mac80211/tx.c                       |  20 +--
 net/wireless/trace.h                    |   3
 55 files changed, 652 insertions(+), 154 deletions(-)

Alexandre Bounine (1):
  rapidio/tsi721_dma: fix failure to obtain transaction descriptor

Andy Lutomirski (1):
  x86_64/entry/xen: Do not invoke espfix64 on Xen

Anssi Hannula (1):
  dm cache: fix race affecting dirty block count

Boris Ostrovsky (1):
  x86/espfix/xen: Fix allocation of pages for paravirt page tables

Christoph Fritz (1):
  ARM: OMAP2+: gpmc: fix gpmc_hwecc_bch_capable()

David Rientjes (1):
  mm, thp: do not allow thp faults to avoid cpuset restrictions

David Vrabel (1):
  x86/xen: no need to explicitly register an NMI callback

Eliad Peller (1):
  cfg80211: fix mic_failure tracing

Eric Biggers (1):
  vfs: fix check for fallocate on active swapfile

Felix Fietkau (1):
  ath9k: fix aggregation session lockup

Greg Kroah-Hartman (1):
  Linux 3.15.9

Greg Thelen (1):
  dm bufio: fully initialize shrinker

H. Peter Anvin (6):
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
  x86, espfix: Move espfix definitions into a separate header file
  x86, espfix: Fix broken header guard
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Make it possible to disable 16-bit support

Haojian Zhuang (1):
  ARM: dts: fix L2 address in Hi3620

James Bottomley (1):
  scsi: handle flush errors properly

Jan Kara (1):
  timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks

Johannes Berg (1):
  Revert "mac80211: move "bufferable MMPDU" check to fix AP 

Re: Linux 3.14.16

2014-08-07 Thread Greg KH

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index c584a51add15..afe68ddbe6a4 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,8 @@ ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
 ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
 ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
+ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
+... unused hole ...
 ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
 ffffffffa0000000 - ffffffffff5fffff (=1525 MB) module mapping space
 ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
diff --git a/Makefile b/Makefile
index 188523e9e880..8b22e24a2d8e 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 14
-SUBLEVEL = 15
+SUBLEVEL = 16
 EXTRAVERSION =
 NAME = Remembering Coco
 
diff --git a/arch/arm/boot/dts/dra7-evm.dts b/arch/arm/boot/dts/dra7-evm.dts
index 5babba0a3a75..904dcf5973f3 100644
--- a/arch/arm/boot/dts/dra7-evm.dts
+++ b/arch/arm/boot/dts/dra7-evm.dts
@@ -182,6 +182,7 @@
regulator-name = "ldo3";
regulator-min-microvolt = <180>;
regulator-max-microvolt = <180>;
+   regulator-always-on;
regulator-boot-on;
};
 
diff --git a/arch/arm/boot/dts/hi3620.dtsi b/arch/arm/boot/dts/hi3620.dtsi
index ab1116d086be..83a5b8685bd9 100644
--- a/arch/arm/boot/dts/hi3620.dtsi
+++ b/arch/arm/boot/dts/hi3620.dtsi
@@ -73,7 +73,7 @@
 
L2: l2-cache {
compatible = "arm,pl310-cache";
-   reg = <0xfc1 0x10>;
+   reg = <0x10 0x10>;
interrupts = <0 15 4>;
cache-unified;
cache-level = <2>;
diff --git a/arch/arm/crypto/aesbs-glue.c b/arch/arm/crypto/aesbs-glue.c
index 4522366da759..15468fbbdea3 100644
--- a/arch/arm/crypto/aesbs-glue.c
+++ b/arch/arm/crypto/aesbs-glue.c
@@ -137,7 +137,7 @@ static int aesbs_cbc_encrypt(struct blkcipher_desc *desc,
dst += AES_BLOCK_SIZE;
} while (--blocks);
}
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -158,7 +158,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc,
bsaes_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
while (walk.nbytes) {
u32 blocks = walk.nbytes / AES_BLOCK_SIZE;
@@ -182,7 +182,7 @@ static int aesbs_cbc_decrypt(struct blkcipher_desc *desc,
dst += AES_BLOCK_SIZE;
src += AES_BLOCK_SIZE;
} while (--blocks);
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -268,7 +268,7 @@ static int aesbs_xts_encrypt(struct blkcipher_desc *desc,
bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, >enc, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
@@ -292,7 +292,7 @@ static int aesbs_xts_decrypt(struct blkcipher_desc *desc,
bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
  walk.nbytes, >dec, walk.iv);
kernel_neon_end();
-   err = blkcipher_walk_done(desc, &walk, 0);
+   err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
 }
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index 8e0e52eb76b5..d7a0ee898d24 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -25,6 +25,13 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
pr_warning("Failed to allocate identity pmd.\n");
return;
}
+   /*
+* Copy the original PMD to ensure that the PMD entries for
+* the kernel image are preserved.
+*/
+   if (!pud_none(*pud))
+   memcpy(pmd, pmd_offset(pud, 0),
+  PTRS_PER_PMD * sizeof(pmd_t));
 

Linux 3.10.52

2014-08-07 Thread Greg KH
I'm announcing the release of the 3.10.52 kernel.

All users of the 3.10 kernel series must upgrade.

The updated 3.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.10.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Documentation/x86/x86_64/mm.txt         |    2
 Makefile                                |    2
 arch/arm/mm/idmap.c                     |    7 +
 arch/x86/Kconfig                        |   25 +++
 arch/x86/include/asm/espfix.h           |   16 ++
 arch/x86/include/asm/irqflags.h         |    2
 arch/x86/include/asm/pgtable_64_types.h |    2
 arch/x86/include/asm/setup.h            |    2
 arch/x86/kernel/Makefile                |    1
 arch/x86/kernel/entry_32.S              |   12 +
 arch/x86/kernel/entry_64.S              |   77 ++-
 arch/x86/kernel/espfix_64.c             |  208
 arch/x86/kernel/ldt.c                   |   10 -
 arch/x86/kernel/paravirt_patch_64.c     |    2
 arch/x86/kernel/smpboot.c               |    7 +
 arch/x86/mm/dump_pagetables.c           |   39 --
 arch/x86/vdso/vdso32-setup.c            |    8 -
 crypto/af_alg.c                         |    2
 drivers/iio/industrialio-buffer.c       |    2
 drivers/net/ethernet/marvell/mvneta.c   |  206 ---
 drivers/rapidio/devices/tsi721_dma.c    |    8 +
 drivers/scsi/scsi_lib.c                 |    8 +
 drivers/staging/vt6655/bssdb.c          |    2
 drivers/staging/vt6655/device_main.c    |    7 -
 include/linux/printk.h                  |    6
 init/main.c                             |    4
 kernel/printk.c                         |    2
 kernel/sched/core.c                     |    2
 kernel/sched/rt.c                       |    2
 kernel/time/clockevents.c               |   10 -
 lib/btree.c                             |    1
 mm/page_alloc.c                         |   16 +-
 net/l2tp/l2tp_ppp.c                     |    4
 net/mac80211/tx.c                       |   27 ++--
 net/wireless/trace.h                    |    3
 35 files changed, 554 insertions(+), 180 deletions(-)

Alexandre Bounine (1):
  rapidio/tsi721_dma: fix failure to obtain transaction descriptor

Andy Lutomirski (1):
  x86_64/entry/xen: Do not invoke espfix64 on Xen

Boris Ostrovsky (1):
  x86/espfix/xen: Fix allocation of pages for paravirt page tables

David Rientjes (1):
  mm, thp: do not allow thp faults to avoid cpuset restrictions

Eliad Peller (1):
  cfg80211: fix mic_failure tracing

Greg Kroah-Hartman (1):
  Linux 3.10.52

H. Peter Anvin (6):
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
  x86, espfix: Move espfix definitions into a separate header file
  x86, espfix: Fix broken header guard
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Make it possible to disable 16-bit support

James Bottomley (1):
  scsi: handle flush errors properly

Jan Kara (1):
  timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks

Johannes Berg (1):
  Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan"

John Stultz (1):
  printk: rename printk_sched to printk_deferred

Konstantin Khlebnikov (1):
  ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout

Lars-Peter Clausen (1):
  iio: buffer: Fix demux table creation

Malcolm Priestley (2):
  staging: vt6655: Fix disassociated messages every 10 seconds
  staging: vt6655: Fix Warning on boot handle_irq_event_percpu.

Milan Broz (1):
  crypto: af_alg - properly label AF_ALG socket

Minfei Huang (1):
  lib/btree.c: fix leak of whole btree nodes

Sasha Levin (1):
  net/l2tp: don't fall back on UDP [get|set]sockopt

willy tarreau (5):
  net: mvneta: increase the 64-bit rx/tx stats out of the hot path
  net: mvneta: use per_cpu stats to fix an SMP lock up
  net: mvneta: do not schedule in mvneta_tx_timeout
  net: mvneta: add missing bit descriptions for interrupt masks and causes
  net: mvneta: replace Tx timer with a real interrupt





Re: Linux 3.10.52

2014-08-07 Thread Greg KH
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 881582f75c9c..bd4370487b07 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,8 @@ c900 - e8ff (=45 bits) vmalloc/ioremap space
 e900 - e9ff (=40 bits) hole
 ea00 - eaff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
+ff00 - ff7f (=39 bits) %esp fixup stacks
+... unused hole ...
 8000 - a000 (=512 MB)  kernel text mapping, from phys 0
 a000 - ff5f (=1525 MB) module mapping space
 ff60 - ffdf (=8 MB) vsyscalls
diff --git a/Makefile b/Makefile
index f9f6ee59c61a..b94f00938acc 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 10
-SUBLEVEL = 51
+SUBLEVEL = 52
 EXTRAVERSION =
 NAME = TOSSUG Baby Fish
 
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index 83cb3ac27095..c61d2373408c 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -24,6 +24,13 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
pr_warning("Failed to allocate identity pmd.\n");
return;
}
+   /*
+* Copy the original PMD to ensure that the PMD entries for
+* the kernel image are preserved.
+*/
+   if (!pud_none(*pud))
+   memcpy(pmd, pmd_offset(pud, 0),
+  PTRS_PER_PMD * sizeof(pmd_t));
pud_populate(&init_mm, pud, pmd);
pmd += pmd_index(addr);
} else
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index af88b27ce313..a649cb686692 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -952,10 +952,27 @@ config VM86
default y
depends on X86_32
---help---
- This option is required by programs like DOSEMU to run 16-bit legacy
- code on X86 processors. It also may be needed by software like
- XFree86 to initialize some video cards via BIOS. Disabling this
- option saves about 6k.
+ This option is required by programs like DOSEMU to run
+ 16-bit real mode legacy code on x86 processors. It also may
+ be needed by software like XFree86 to initialize some video
+ cards via BIOS. Disabling this option saves about 6K.
+
+config X86_16BIT
+   bool "Enable support for 16-bit segments" if EXPERT
+   default y
+   ---help---
+ This option is required by programs like Wine to run 16-bit
+ protected mode legacy code on x86 processors.  Disabling
+ this option saves about 300 bytes on i386, or around 6K text
+ plus 16K runtime memory on x86-64,
+
+config X86_ESPFIX32
+   def_bool y
+   depends on X86_16BIT && X86_32
+
+config X86_ESPFIX64
+   def_bool y
+   depends on X86_16BIT && X86_64
 
 config TOSHIBA
tristate "Toshiba Laptop support"
diff --git a/arch/x86/include/asm/espfix.h b/arch/x86/include/asm/espfix.h
new file mode 100644
index ..99efebb2f69d
--- /dev/null
+++ b/arch/x86/include/asm/espfix.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_X86_ESPFIX_H
+#define _ASM_X86_ESPFIX_H
+
+#ifdef CONFIG_X86_64
+
+#include <asm/percpu.h>
+
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_stack);
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr);
+
+extern void init_espfix_bsp(void);
+extern void init_espfix_ap(void);
+
+#endif /* CONFIG_X86_64 */
+
+#endif /* _ASM_X86_ESPFIX_H */
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index bba3cf88e624..0a8b519226b8 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -129,7 +129,7 @@ static inline notrace unsigned long arch_local_irq_save(void)
 
 #define PARAVIRT_ADJUST_EXCEPTION_FRAME/*  */
 
-#define INTERRUPT_RETURN   iretq
+#define INTERRUPT_RETURN   jmp native_iret
 #define USERGS_SYSRET64\
swapgs; \
sysretq;
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 2d883440cb9a..b1609f2c524c 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -61,6 +61,8 @@ typedef struct { pteval_t pte; } pte_t;
 #define MODULES_VADDR_AC(0xa000, UL)
 #define MODULES_END  _AC(0xff00, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
+#define ESPFIX_PGD_ENTRY _AC(-2, UL)
+#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << PGDIR_SHIFT)
 
 #define EARLY_DYNAMIC_PAGE_TABLES  64
 
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index b7bf3505e1ec..2e327f114a1b 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -62,6 +62,8 @@ static inline void 

Re: Linux 3.4.102

2014-08-07 Thread Greg KH

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index d6498e3cd713..f33a9369e35b 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,6 +12,8 @@ c900 - e8ff (=45 bits) vmalloc/ioremap space
 e900 - e9ff (=40 bits) hole
 ea00 - eaff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
+ff00 - ff7f (=39 bits) %esp fixup stacks
+... unused hole ...
 8000 - a000 (=512 MB)  kernel text mapping, from phys 0
 a000 - fff0 (=1536 MB) module mapping space
 
diff --git a/Makefile b/Makefile
index a22bcb567348..dd03fa5777a0 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 101
+SUBLEVEL = 102
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index ab88ed4f8e08..ef8f2df02540 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -22,6 +22,13 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
pr_warning("Failed to allocate identity pmd.\n");
return;
}
+   /*
+* Copy the original PMD to ensure that the PMD entries for
+* the kernel image are preserved.
+*/
+   if (!pud_none(*pud))
+   memcpy(pmd, pmd_offset(pud, 0),
+  PTRS_PER_PMD * sizeof(pmd_t));
pud_populate(&init_mm, pud, pmd);
pmd += pmd_index(addr);
} else
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9df4ea1caaf1..917c1098775b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -915,10 +915,27 @@ config VM86
default y
depends on X86_32
---help---
- This option is required by programs like DOSEMU to run 16-bit legacy
- code on X86 processors. It also may be needed by software like
- XFree86 to initialize some video cards via BIOS. Disabling this
- option saves about 6k.
+ This option is required by programs like DOSEMU to run
+ 16-bit real mode legacy code on x86 processors. It also may
+ be needed by software like XFree86 to initialize some video
+ cards via BIOS. Disabling this option saves about 6K.
+
+config X86_16BIT
+   bool "Enable support for 16-bit segments" if EXPERT
+   default y
+   ---help---
+ This option is required by programs like Wine to run 16-bit
+ protected mode legacy code on x86 processors.  Disabling
+ this option saves about 300 bytes on i386, or around 6K text
+ plus 16K runtime memory on x86-64,
+
+config X86_ESPFIX32
+   def_bool y
+   depends on X86_16BIT && X86_32
+
+config X86_ESPFIX64
+   def_bool y
+   depends on X86_16BIT && X86_64
 
 config TOSHIBA
tristate "Toshiba Laptop support"
diff --git a/arch/x86/include/asm/espfix.h b/arch/x86/include/asm/espfix.h
new file mode 100644
index ..99efebb2f69d
--- /dev/null
+++ b/arch/x86/include/asm/espfix.h
@@ -0,0 +1,16 @@
+#ifndef _ASM_X86_ESPFIX_H
+#define _ASM_X86_ESPFIX_H
+
+#ifdef CONFIG_X86_64
+
+#include <asm/percpu.h>
+
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_stack);
+DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr);
+
+extern void init_espfix_bsp(void);
+extern void init_espfix_ap(void);
+
+#endif /* CONFIG_X86_64 */
+
+#endif /* _ASM_X86_ESPFIX_H */
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index bba3cf88e624..0a8b519226b8 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -129,7 +129,7 @@ static inline notrace unsigned long arch_local_irq_save(void)
 
 #define PARAVIRT_ADJUST_EXCEPTION_FRAME/*  */
 
-#define INTERRUPT_RETURN   iretq
+#define INTERRUPT_RETURN   jmp native_iret
 #define USERGS_SYSRET64\
swapgs; \
sysretq;
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 766ea16fbbbd..51817fae7047 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -59,5 +59,7 @@ typedef struct { pteval_t pte; } pte_t;
 #define MODULES_VADDR_AC(0xa000, UL)
 #define MODULES_END  _AC(0xff00, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
+#define ESPFIX_PGD_ENTRY _AC(-2, UL)
+#define ESPFIX_BASE_ADDR (ESPFIX_PGD_ENTRY << PGDIR_SHIFT)
 
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index d0f19f9fb846..16c7971457f8 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -61,6 +61,8 @@ static inline void x86_ce4100_early_setup(void) { }
 
 #ifndef _SETUP
 

Linux 3.4.102

2014-08-07 Thread Greg KH
I'm announcing the release of the 3.4.102 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git 
linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Documentation/x86/x86_64/mm.txt         |    2
 Makefile                                |    2
 arch/arm/mm/idmap.c                     |    7 +
 arch/x86/Kconfig                        |   25 +++
 arch/x86/include/asm/espfix.h           |   16 ++
 arch/x86/include/asm/irqflags.h         |    2
 arch/x86/include/asm/pgtable_64_types.h |    2
 arch/x86/include/asm/setup.h            |    2
 arch/x86/kernel/Makefile                |    1
 arch/x86/kernel/entry_32.S              |   12 +
 arch/x86/kernel/entry_64.S              |   80 ++--
 arch/x86/kernel/espfix_64.c             |  208
 arch/x86/kernel/ldt.c                   |   10 -
 arch/x86/kernel/paravirt_patch_64.c     |    2
 arch/x86/kernel/smpboot.c               |    7 +
 arch/x86/mm/dump_pagetables.c           |   39 --
 arch/x86/vdso/vdso32-setup.c            |    8 -
 crypto/af_alg.c                         |    2
 drivers/scsi/scsi_lib.c                 |    8 +
 include/linux/printk.h                  |    6
 include/linux/skbuff.h                  |   17 --
 init/main.c                             |    4
 kernel/printk.c                         |    2
 kernel/sched/core.c                     |    2
 kernel/sched/rt.c                       |    2
 kernel/time/clockevents.c               |   10 -
 lib/btree.c                             |    1
 mm/mlock.c                              |    2
 mm/page_alloc.c                         |   16 +-
 mm/rmap.c                               |   14 +-
 net/ipv4/ip_forward.c                   |   68 --
 net/ipv6/addrconf.c                     |   14 +-
 net/ipv6/ip6_output.c                   |   13 --
 net/l2tp/l2tp_ppp.c                     |    4
 34 files changed, 446 insertions(+), 164 deletions(-)

Andy Lutomirski (1):
  x86_64/entry/xen: Do not invoke espfix64 on Xen

Boris Ostrovsky (1):
  x86/espfix/xen: Fix allocation of pages for paravirt page tables

David Rientjes (1):
  mm, thp: do not allow thp faults to avoid cpuset restrictions

Gao feng (1):
  ipv6: reallocate addrconf router for ipv6 address when lo device up

Greg Kroah-Hartman (2):
  Revert: "net: ip, ipv6: handle gso skbs in forwarding path"
  Linux 3.4.102

H. Peter Anvin (6):
  Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
  x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
  x86, espfix: Move espfix definitions into a separate header file
  x86, espfix: Fix broken header guard
  x86, espfix: Make espfix64 a Kconfig option, fix UML
  x86, espfix: Make it possible to disable 16-bit support

James Bottomley (1):
  scsi: handle flush errors properly

Jan Kara (1):
  timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks

John Stultz (1):
  printk: rename printk_sched to printk_deferred

Konstantin Khlebnikov (1):
  ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout

Milan Broz (1):
  crypto: af_alg - properly label AF_ALG socket

Minfei Huang (1):
  lib/btree.c: fix leak of whole btree nodes

Sasha Levin (1):
  net/l2tp: don't fall back on UDP [get|set]sockopt

Vlastimil Babka (1):
  mm: try_to_unmap_cluster() should lock_page() before mlocking





Re: [PATCH] staging: vt6655: wpactl.c: Fix sparse warnings

2014-08-07 Thread Greg Kroah-Hartman
On Thu, Aug 07, 2014 at 11:08:34PM +0100, Martin Berglund wrote:
> Add missing __user macro casting in the function wpa_set_keys.
> This is okay since the function handles the possibility of
> param->u.wpa_key.key and param->u.wpa_key.seq pointing to
> kernelspace using a flag, fcpfkernel.
> 
> Signed-off-by: Martin Berglund 
> ---
> This was submitted as part of Eudyptula challenge task 16
> 
>  drivers/staging/vt6655/wpactl.c |8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/vt6655/wpactl.c b/drivers/staging/vt6655/wpactl.c
> index 5f454ca..d75dd79 100644
> --- a/drivers/staging/vt6655/wpactl.c
> +++ b/drivers/staging/vt6655/wpactl.c
> @@ -224,7 +224,9 @@ int wpa_set_keys(PSDevice pDevice, void *ctx,
>   } else {
>   spin_unlock_irq(>lock);
>   if (param->u.wpa_key.key &&
> - copy_from_user([0], param->u.wpa_key.key, 
> param->u.wpa_key.key_len)) {
> + copy_from_user([0],
> +(void __user *)param->u.wpa_key.key,

Would it be better to mark this pointer as __user in the structure
itself?  Or is it also used as a kernel structure in other places?

thanks,

greg k-h
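
To illustrate the question above, here is a minimal user-space sketch of what annotating the pointer once, in the structure itself, could look like instead of casting at every call site. Everything prefixed with sk_ is hypothetical and is not the vt6655 driver code; __user is defined as empty here because it only has meaning to the sparse checker, and sk_copy_from_user() is a plain memcpy() stand-in for the kernel helper. Whether this is appropriate depends on the answer to the second question, namely whether the same field also carries kernel pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef __user
#define __user /* sparse address-space annotation; empty for a normal compiler */
#endif

/* Hypothetical structure: the key pointer is annotated once, here. */
struct sk_wpa_key {
	const unsigned char __user *key;
	size_t key_len;
};

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long sk_copy_from_user(void *to, const void __user *from,
				       unsigned long n)
{
	memcpy(to, (const void *)from, n);
	return 0;
}

/* No cast is needed at the call site because the field type carries __user. */
static int sk_fetch_key(unsigned char *dst, size_t dst_len,
			const struct sk_wpa_key *k)
{
	assert(dst != NULL && k != NULL);
	if (k->key == NULL || k->key_len > dst_len)
		return -1;
	return sk_copy_from_user(dst, k->key, k->key_len) ? -1 : 0;
}
```

With the annotation on the field, sparse would flag any place that treats the pointer as a kernel pointer, which is exactly the case the fcpfkernel flag handles.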


Re: [RFC PATCH] kprobes: arm: enable OPTPROBES for arm 32

2014-08-07 Thread Masami Hiramatsu
(2014/08/08 10:25), Wang Nan wrote:
> On 2014/8/7 14:59, Masami Hiramatsu wrote:
>> (2014/08/06 15:24), Wang Nan wrote:
> +
> +static void
> +optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
> +{
> + unsigned long flags;
> +
> + regs->ARM_pc = (unsigned long)op->kp.addr;
> + regs->ARM_ORIG_r0 = ~0UL;
> +
> +
> + local_irq_save(flags);
> + /* 
> +  * This is possible if op is under delayed unoptimizing.
> +  * We need simulate the replaced instruction.
> +  */
> + if (kprobe_disabled(&op->kp)) {
> + struct kprobe *p = &op->kp;
> + op->kp.ainsn.insn_singlestep(p->opcode, &p->ainsn, regs);
> + } else {
> + kprobe_handler(regs);
> + }

 You don't need brace "{}" for one statement.
 By the way, why don't you call opt_pre_handler()?

>>>
>>> I use kprobe_handler because it handles instruction emulation.
>>>
>>> In addition, I'm not very sure whether skipping the complex checks
>>> in kprobe_handler() is safe or not.
>>
>> That seems to do same thing on x86. Then you should do something like
>> the optimized_callback() on x86 as below.
>>
>> static void
>> optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
>> {
>> struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>> unsigned long flags;
>>
>> local_irq_save(flags);
>> if (kprobe_running()) {
>> kprobes_inc_nmissed_count(&op->kp);
> 
> In this case we still need a singlestep, right?

Ah, right! And if the singlestep requires regs->ARM_pc to be set up,
we must also do that before this check. So the right code would be:

static void
optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	unsigned long flags;

	local_irq_save(flags);
	/* Save skipped registers */
	regs->ARM_pc = (unsigned long)op->kp.addr;
	regs->ARM_ORIG_r0 = ~0UL;

	if (kprobe_running())
		kprobes_inc_nmissed_count(&op->kp);
	else {
		__this_cpu_write(current_kprobe, &op->kp);
		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
		opt_pre_handler(&op->kp, regs);
		__this_cpu_write(current_kprobe, NULL);
	}
	op->kp.ainsn.insn_singlestep(op->kp.opcode, &op->kp.ainsn, regs);
	local_irq_restore(flags);
}
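
The control flow of this callback can be modeled in plain user-space C to check the two paths: a hit while another probe is running only bumps the missed counter, while a normal hit runs the pre-handler, and the replaced instruction is emulated in both cases. This is only a sketch under stated assumptions: every sim_ name is hypothetical, sim_current stands in for the per-CPU current_kprobe, and nothing here is kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of this is the kernel API. */
struct sim_probe {
	unsigned long nmissed;     /* hits skipped because a probe was running */
	unsigned long pre_calls;   /* times the pre-handler ran */
	unsigned long singlesteps; /* times the replaced instruction was emulated */
};

/* Models the per-CPU current_kprobe pointer. */
static struct sim_probe *sim_current;

static void sim_optimized_callback(struct sim_probe *p)
{
	assert(p != NULL);

	if (sim_current != NULL) {
		/* Another probe is active: only count the miss. */
		p->nmissed++;
	} else {
		sim_current = p;
		p->pre_calls++;    /* stands in for opt_pre_handler() */
		sim_current = NULL;
	}

	/* The replaced instruction is emulated on both paths. */
	p->singlesteps++;
}
```

The point of the model is the last statement: the emulation step sits outside the if/else, so a missed hit still executes the original instruction.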

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com




Re: [PATCH] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-07 Thread Yuyang Du
On Wed, Aug 06, 2014 at 11:21:35AM -0700, Jason Low wrote:
> I ran these tests with most of the AIM7 workloads to compare its
> performance between a 3.16 kernel and the kernel with these patches
> applied.
> 
> The table below contains the percent difference between the baseline
> kernel and the kernel with the patches at various user counts. A
> positive percent means the kernel with the patches performed better,
> while a negative percent means the baseline performed better.
> 
> Based on these numbers, for many of the workloads, the change was
> beneficial in the highly contended cases, while it had a negative impact in
> many of the lightly/moderately contended cases (10 to 90 users).
> 
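> ----------------------------------------------------------
>               |   10-90   |  100-1000   |  1100-2000
>               |   users   |    users    |    users
> ----------------------------------------------------------
> alltests      |   -3.37%  |   -10.64%   |    -2.25%
> ----------------------------------------------------------
> all_utime     |   +0.33%  |    +3.73%   |    +3.33%
> ----------------------------------------------------------
> compute       |   -5.97%  |    +2.34%   |    +3.22%
> ----------------------------------------------------------
> custom        |  -31.61%  |   -10.29%   |   +15.23%
> ----------------------------------------------------------
> disk          |  +24.64%  |   +28.96%   |   +21.28%
> ----------------------------------------------------------
> fserver       |   -1.35%  |    +4.82%   |    +9.35%
> ----------------------------------------------------------
> high_systime  |   -6.73%  |    -6.28%   |   +12.36%
> ----------------------------------------------------------
> shared        |  -28.31%  |   -19.99%   |    -7.10%
> ----------------------------------------------------------
> short         |  -44.63%  |   -37.48%   |   -33.62%
> ----------------------------------------------------------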
> 
Thanks, Jason. Sorry for the late response.

What is the run-to-run variation of the tests? And which machine did you test on?

Yuyang


linux-next: build failure after merge of the ext3 tree

2014-08-07 Thread Stephen Rothwell
Hi Jan,

After merging the ext3 tree, today's linux-next build (x86_64
allmodconfig) failed like this:

ERROR: "new_inode_pseudo" [fs/reiserfs/reiserfs.ko] undefined!

Caused by commit 523096294315 ("reiserfs: Avoid warning from 
unlock_new_inode()").

I have used the ext3 tree from next-20140807 for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 0/5] irq / PM: Shared IRQs vs IRQF_NO_SUSPEND and suspend-to-idle wakeup

2014-08-07 Thread Rafael J. Wysocki
On Tuesday, August 05, 2014 05:22:57 PM Rafael J. Wysocki wrote:
> On Friday, August 01, 2014 04:29:40 PM Rafael J. Wysocki wrote:
> > On Friday, August 01, 2014 03:43:21 PM Thomas Gleixner wrote:
> > > On Fri, 1 Aug 2014, Rafael J. Wysocki wrote:
> > > > OK, I guess "IRQ_HANDLED from a wakeup interrupt" may be interpreted as
> > > > IRQ_HANDLED_PMWAKE.  On the other hand, if that's going to be handled in
> > > > handle_irq_event_percpu(), then using a special return code would save us
> > > > a branch for IRQ_HANDLED interrupts.  We could convert it to IRQ_HANDLED
> > > > immediately then.
> > > 
> > > We can handle it at the end of the function by calling
> > > note_interrupt() unconditionally and doing the following there:
> > > 
> > >   if (suspended) {
> > >           if (ret == IRQ_NONE) {
> > >                   if (shared)
> > >                           yell_and_abort_or_resume();
> > >           } else {
> > >                   abort_or_resume();
> > >           }
> > >   }
> > >   if (noirqdebug)
> > >           return;
> > 
> > I see.
> > 
> > > > OK, I'll take a stab at the IRQF_SHARED thing if you don't mind.
> > > 
> > > Definitely not :)
> > > 
> > > > Here's my current understanding of what can be done for IRQF_NO_SUSPEND.
> > > > 
> > > > In suspend_device_irqs():
> > > > 
> > > > (1) If all actions in the list have the same setting (e.g. IRQF_NO_SUSPEND
> > > > unset), keep the current behavior.
> > > > (2) If the actions have different settings:
> > > > - Actions with IRQF_NO_SUSPEND set are not modified.
> > > > - Actions with IRQF_NO_SUSPEND unset are switched over to a stub handler.
> > > > - IRQS_SUSPEND_MODE (new flag) is set for the IRQ.
> > > 
> > > Can we please do that in setup_irq() and let the shared ones always
> > > run through the stub? That keeps suspend/resume_device_irqs() simple.
> > 
> > OK
> 
> Here's a patch series based on what we talked about.
> 
> [1/5] Mechanism to wake up the system or abort suspend in progress automatically.
> [2/5] Fix for shared IRQs vs IRQF_NO_SUSPEND (with wakeup in mind).
> [3/5] Wakeup interrupts support for suspend-to-idle.
> [4/5] Set IRQCHIP_SKIP_SET_WAKE for x86 IOAPIC IRQ chips.
> [5/5] Make PCIe PME wake up from suspend to idle.
> 
> All tested on MSI Wind that has a couple of issues being addressed.

Below is the doc patch I promised.

I'm not signing it off yet, as I'm sending it for comments mostly rather than
as something actually done.

Of course, it documents the situation after the series of [1-5/5] (with the
updated [3/5] I've just sent), especially as far as the suspend-to-idle goes.

Rafael


---
 Documentation/power/suspend-and-interrupts.txt |  136 +
 1 file changed, 136 insertions(+)

Index: linux-pm/Documentation/power/suspend-and-interrupts.txt
===
--- /dev/null
+++ linux-pm/Documentation/power/suspend-and-interrupts.txt
@@ -0,0 +1,136 @@
+System Suspend and Device Interrupts
+
+Copyright (C) 2014 Intel Corp.
+Author: Rafael J. Wysocki 
+
+
+Suspending and Resuming Device IRQs
+-----------------------------------
+
+Device interrupt request lines (IRQs) are generally disabled during system
+suspend after the "late" phase of suspending devices (that is, after all of the
+->prepare, ->suspend and ->suspend_late callbacks have been executed for all
+devices).  That is done by the suspend_device_irqs() function.
+
+The rationale for doing so is that after the "late" phase of device suspend
+there is no legitimate reason why any interrupts from suspended devices should
+trigger and if any devices have not been suspended properly yet, it is better to
+block interrupts from them anyway.  Also in the past we had problems with
+interrupt handlers of devices that shared IRQs with other devices and were not
+prepared for interrupts triggering after their devices had been suspended.
+In those cases they would attempt to access, for example, memory address spaces
+of suspended devices and cause unpredictable behavior to ensue as a result.
+Unfortunately, such problems are very difficult to debug and the introduction
+of suspend_device_irqs(), along with the "noirq" phase of device suspend, was
+the only practical way to prevent them from happening.
+
+Device IRQs are re-enabled during system resume, right before the "early" phase
+of resuming devices (that is, before starting to execute ->resume_early
+callbacks for devices).  The function doing that is resume_device_irqs().
+
+
+The IRQF_NO_SUSPEND Flag
+------------------------
+
+There are interrupts that can legitimately trigger during the entire system
+suspend-resume cycle, including the "noirq" phases of suspending and resuming
+devices as well as during the time when nonboot CPUs are taken offline and
+brought back online.  That applies to timer interrupts in the first place,
+but also to IPIs and to some other special-purpose interrupts, such as the ACPI
+SCI that isn't 

Re: [PATCH v4 2/4] usb: dwc2: add compatible data for rockchip soc

2014-08-07 Thread Kever.Yang


On 08/08/2014 04:52 AM, Doug Anderson wrote:

Paul,

On Thu, Aug 7, 2014 at 11:26 AM, Paul Zimmerman
 wrote:

From: Kever Yang [mailto:kever.y...@gmail.com] On Behalf Of Kever Yang
Sent: Thursday, August 07, 2014 2:35 AM

This patch adds compatible data for the dwc2 controller found on
rk3066, rk3188 and rk3288 processors from Rockchip.

Signed-off-by: Kever Yang 
Acked-by: Paul Zimmerman 
---

Changes in v4:
- max_transfer_size change to 65536, this should be enough
   for most transfer, the hardware auto-detect will set this
   to 0x7 which may make dma_alloc_coherent fail when
   non-dword aligned buf from driver like usbnet happen.

Hi Kever,

Did you test this change thoroughly? I have vague memories of any
value above 65535 causing problems, at least on my hardware. And I
see it is set to 65535 in both pci.c and platform.c. I could be
wrong, but I thought I should mention it.

Certainly it is documented in the header file to have a max of 65535:

  * @max_transfer_size:  The maximum transfer size supported, in bytes
  *   2047 to 65,535
  *  Actual maximum value is autodetected and also
  *  the default.
Sorry for not checking the header file; I'll change it to 65535 and resubmit.


...but looking at the register definition that I see, the size can be
up to 19 bits.  A 19-bit transfer far exceeds 65535.  Do you remember
what the error was?  Certainly I can imagine there being errors with
large calls to dma_alloc_coherent()...

I know that with Kever's change I can do USB Ethernet downloads, so it
is at least working to some degree.  ...to me it feels like Kever
should resubmit with 65535 (to match the documentation) and then work
in the background to figure out what the max_transfer_size really
ought to be.

You are right.

-Doug
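
The numbers in this exchange can be checked with a short sketch: a 19-bit size field can hold far more than the documented 65535 limit, so clamping to the documented range is the conservative choice until the real hardware limit is known. All SKETCH_/sketch_ names are hypothetical; only the values 19 bits, 2047 and 65535 come from the thread and the quoted header comment, and this is not the dwc2 driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the discussion above; the names are hypothetical. */
#define SKETCH_XFERSIZE_BITS 19u     /* register field width mentioned by Doug */
#define SKETCH_DOC_MIN_XFER  2047u   /* documented lower bound */
#define SKETCH_DOC_MAX_XFER  65535u  /* documented upper bound */

/* Largest value that fits in an unsigned field of the given width. */
static uint32_t sketch_field_max(unsigned bits)
{
	assert(bits > 0 && bits < 32);
	return (UINT32_C(1) << bits) - 1;
}

/* Clamp a requested max_transfer_size to the documented range. */
static uint32_t sketch_clamp_max_transfer_size(uint32_t requested)
{
	if (requested < SKETCH_DOC_MIN_XFER)
		return SKETCH_DOC_MIN_XFER;
	if (requested > SKETCH_DOC_MAX_XFER)
		return SKETCH_DOC_MAX_XFER;
	return requested;
}
```

A 19-bit field tops out at 524287 bytes, which is eight times the documented maximum, so the v4 value of 65536 is only one byte over the documented bound and well inside what the field can hold.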








Re: [PATCH v1] MIPS:KDUMP: set a right value to kexec_indirection_page variable

2014-08-07 Thread Yang,Wei

Ralf,

What do you think of this patch?

Thanks
Wei
On 08/04/2014 11:46 AM, Yang,Wei wrote:

ping.

BR,
Wei
On 07/31/2014 07:42 PM, wei.y...@windriver.com wrote:

From: Yang Wei 

The crash type has no indirection page, so the head field of the kimage
structure holds IND_DONE rather than the address of an indirection page.
We therefore have to set the kexec_indirection_page variable to the
address of the head field of the image structure.

Signed-off-by: Yang Wei 

   Hi Ralf,

  Please help me take a look at this patch, I have already 
verified it on Cavium 6100EVB board.


  Thanks
  Wei
---
  arch/mips/kernel/machine_kexec.c |9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/machine_kexec.c b/arch/mips/kernel/machine_kexec.c
index 992e184..531b70d 100644
--- a/arch/mips/kernel/machine_kexec.c
+++ b/arch/mips/kernel/machine_kexec.c
@@ -71,8 +71,13 @@ machine_kexec(struct kimage *image)
  kexec_start_address =
  (unsigned long) phys_to_virt(image->start);
  -kexec_indirection_page =
-(unsigned long) phys_to_virt(image->head & PAGE_MASK);
+if (image->type == KEXEC_TYPE_DEFAULT) {
+kexec_indirection_page =
+(unsigned long) phys_to_virt(image->head & PAGE_MASK);
+} else {
+kexec_indirection_page = (unsigned long)&image->head;
+}
+
memcpy((void*)reboot_code_buffer, relocate_new_kernel,
 relocate_new_kernel_size);








[Update][PATCH 3/5] irq / PM: Make wakeup interrupts wake up from suspend-to-idle

2014-08-07 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

Make IRQs enabled for system wakeup via enable_irq_wake() wake up
the system from suspend-to-idle.

For this purpose, introduce a new routine for enabling and disabling
wakeup interrupts, set_wakeup_irqs(), and make freeze_enter() call it
to enable them before starting the suspend-to-idle loop and to
disable them after that loop has been terminated.

When enabling an IRQ which is not a shared one, that routine
replaces its original handler with a stub one always returning
IRQ_NONE.  When disabling it, set_wakeup_irqs() restores the
original handler for it.  This way, IRQ_NONE is returned for all
of the wakeup interrupts during suspend-to-idle and that triggers
the abort-suspend-or-wakeup condition in note_interrupt() causing
the system to wake up.

To avoid losing wakeup events, make note_interrupt() mark wakeup
interrupts as pending before triggering wakeup for irqs_suspended
set.

Signed-off-by: Rafael J. Wysocki 
---

When I was working on the doc (that I'm going to send shortly), it
occurred to me that it actually would be better to always return
IRQ_NONE from interrupt handlers for wakeup interrupts when
irqs_suspended is set - and mark them as pending to avoid losing
wakeup events.  That way they'll work uniformly, even if someone is
insane enough to use enable_irq_wake() on an IRQF_NO_SUSPEND IRQ.

Rafael

---
 include/linux/interrupt.h |1 +
 kernel/irq/pm.c   |   45 +
 kernel/irq/spurious.c |6 +-
 kernel/power/suspend.c|3 +++
 4 files changed, 54 insertions(+), 1 deletion(-)

Index: linux-pm/include/linux/interrupt.h
===
--- linux-pm.orig/include/linux/interrupt.h
+++ linux-pm/include/linux/interrupt.h
@@ -197,6 +197,7 @@ extern void irq_wake_thread(unsigned int
 /* The following three functions are for the core kernel use only. */
 extern void suspend_device_irqs(void);
 extern void resume_device_irqs(void);
+extern void set_wakeup_irqs(bool enable);
 #ifdef CONFIG_PM_SLEEP
 extern int check_wakeup_irqs(void);
 #else
Index: linux-pm/kernel/irq/pm.c
===
--- linux-pm.orig/kernel/irq/pm.c
+++ linux-pm/kernel/irq/pm.c
@@ -130,3 +130,48 @@ int check_wakeup_irqs(void)
 
return 0;
 }
+
+static irqreturn_t irq_pm_empty_handler(int irq, void *dev_id)
+{
+   return IRQ_NONE;
+}
+
+void set_wakeup_irqs(bool enable)
+{
+   struct irq_desc *desc;
+   int irq;
+
+   for_each_irq_desc(irq, desc) {
+   struct irqaction *action = desc->action;
+   unsigned long flags;
+
+   raw_spin_lock_irqsave(&desc->lock, flags);
+
+   if (action && irqd_is_wakeup_set(&desc->irq_data) &&
+   !desc->skip_suspend) {
+   if (enable) {
+   /*
+* Replace handlers for not shared interrupts.
+* Shared ones have wrapper handlers already.
+*/
+   if (!action->next) {
+   action->s_handler = action->handler;
+   action->handler = irq_pm_empty_handler;
+   }
+   desc->istate &= ~IRQS_SUSPENDED;
+   __enable_irq(desc, irq, false);
+   } else {
+   if (!(desc->istate & IRQS_SUSPENDED)) {
+   __disable_irq(desc, irq, false);
+   desc->istate |= IRQS_SUSPENDED;
+   }
+   if (action->handler == irq_pm_empty_handler) {
+   action->handler = action->s_handler;
+   action->s_handler = NULL;
+   }
+   }
+   }
+
+   raw_spin_unlock_irqrestore(&desc->lock, flags);
+   }
+}
Index: linux-pm/kernel/power/suspend.c
===
--- linux-pm.orig/kernel/power/suspend.c
+++ linux-pm/kernel/power/suspend.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -55,7 +56,9 @@ static void freeze_enter(void)
 {
cpuidle_use_deepest_state(true);
cpuidle_resume();
+   set_wakeup_irqs(true);
wait_event(suspend_freeze_wait_head, suspend_freeze_wake);
+   set_wakeup_irqs(false);
cpuidle_pause();
cpuidle_use_deepest_state(false);
 }
Index: linux-pm/kernel/irq/spurious.c
===
--- linux-pm.orig/kernel/irq/spurious.c
+++ linux-pm/kernel/irq/spurious.c
@@ -277,7 +277,11 @@ void note_interrupt(unsigned int irq, st
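
A minimal userspace sketch of the handler swap that set_wakeup_irqs()
performs for non-shared interrupts, as described in the changelog above.
All sk_* names are hypothetical; locking, the IRQS_SUSPENDED state and the
note_interrupt() side are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SK_IRQ_NONE = 0, SK_IRQ_HANDLED = 1 } sk_irqreturn_t;
typedef sk_irqreturn_t (*sk_handler_t)(int irq, void *dev_id);

/* Hypothetical, trimmed-down model of struct irqaction. */
struct sk_irqaction {
	sk_handler_t handler;	/* handler currently installed */
	sk_handler_t s_handler;	/* original handler, saved while swapped */
	struct sk_irqaction *next;	/* non-NULL for shared interrupts */
};

/* Stub that always reports "not handled". */
static sk_irqreturn_t sk_empty_handler(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return SK_IRQ_NONE;
}

/* Example device handler used to exercise the swap. */
static sk_irqreturn_t sk_device_handler(int irq, void *dev_id)
{
	(void)irq;
	(void)dev_id;
	return SK_IRQ_HANDLED;
}

static void sk_set_wakeup(struct sk_irqaction *action, int enable)
{
	if (enable) {
		/* Only non-shared interrupts get their handler replaced. */
		if (!action->next) {
			action->s_handler = action->handler;
			action->handler = sk_empty_handler;
		}
	} else if (action->handler == sk_empty_handler) {
		/* Restore the original handler on the way out. */
		action->handler = action->s_handler;
		action->s_handler = NULL;
	}
}
```

While the stub is installed every wakeup interrupt returns the "none"
value, which is what the changelog relies on to trigger the wakeup path.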
   

Re: linux-next: build failure after merge of the modules tree

2014-08-07 Thread Stephen Rothwell
Hi Rusty,

On Thu, 07 Aug 2014 22:37:50 +0930 Rusty Russell  wrote:
>
> Stephen Rothwell  writes:
> > Hi Rusty,
> >
> > On Thu, 07 Aug 2014 21:07:20 +0930 Rusty Russell  wrote:
> >>
> >> Ah, crap.  I really hate that macro magic :(
> >> 
> >> I amended to the minimal fix, so we can pretend that never happened.
> >
> > How about arch/powerpc/platforms/powernv/opal-dump.c as well?
> 
> Hmm, I didn't see that one.

It was in the original report, but I guess a bit hidden by all the guff
the gcc spews out these days.

> I've done that as a separate patch.

Thanks.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 2/4] perf, x86: Don't mark DataLA addresses as store

2014-08-07 Thread Stephane Eranian
On Fri, Aug 8, 2014 at 2:15 AM, Andi Kleen  wrote:
> From: Andi Kleen 
>
> Haswell supports reporting the data address for a range
> of PEBS events, including:
>
> UOPS_RETIRED.ALL
> MEM_UOPS_RETIRED.STLB_MISS_LOADS
> MEM_UOPS_RETIRED.STLB_MISS_STORES
> MEM_UOPS_RETIRED.LOCK_LOADS
> MEM_UOPS_RETIRED.SPLIT_LOADS
> MEM_UOPS_RETIRED.SPLIT_STORES
> MEM_UOPS_RETIRED.ALL_LOADS
> MEM_UOPS_RETIRED.ALL_STORES
> MEM_LOAD_UOPS_RETIRED.L1_HIT
> MEM_LOAD_UOPS_RETIRED.L2_HIT
> MEM_LOAD_UOPS_RETIRED.L3_HIT
> MEM_LOAD_UOPS_RETIRED.L1_MISS
> MEM_LOAD_UOPS_RETIRED.L2_MISS
> MEM_LOAD_UOPS_RETIRED.L3_MISS
> MEM_LOAD_UOPS_RETIRED.HIT_LFB
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
> MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM
>
> This facility was already enabled earlier with the original Haswell
> perf changes.
>
> However these addresses were always reported as stores by perf, which is wrong,
> as they could be loads or NA too.
>
> This patch uses the load/store/na flags added earlier to report the correct
> operation based on the event type. For some events this is NA.
>
> v2: Supports load/stores/na again instead of marking everything NA
> Signed-off-by: Andi Kleen 
> ---
>  arch/x86/kernel/cpu/perf_event_intel_ds.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index aca77e9..855c19e 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -108,13 +108,19 @@ static u64 precise_store_data(u64 status)
> return val;
>  }
>
> -static u64 precise_store_data_hsw(struct perf_event *event, u64 status)
> +static u64 precise_store_data_hsw(struct perf_event *event, u64 status,
> + unsigned flags)
>  {
> union perf_mem_data_src dse;
> u64 cfg = event->hw.config & INTEL_ARCH_EVENT_MASK;
>
> dse.val = 0;
No. This is not valid. You are not initializing all the fields later on.
It needs to be:
  dse.val = PERF_MEM_NA
(which is defined in my patch).

> -   dse.mem_op = PERF_MEM_OP_STORE;
> +   if (flags & PERF_X86_EVENT_PEBS_LD_HSW)
> +   dse.mem_op = PERF_MEM_OP_LOAD;
> +   else if (flags & PERF_X86_EVENT_PEBS_ST_HSW)
> +   dse.mem_op = PERF_MEM_OP_STORE;
> +   else
> +   dse.mem_op = PERF_MEM_OP_NA;
> dse.mem_lvl = PERF_MEM_LVL_NA;
>
But you also have the LOCK, SNOOP and TLB fields.
So it is better to initialize dse.val globally once, and then patch
it based on what you can gather.

I will repost the whole series with my fixes and cleanups.

> /*
> @@ -868,7 +874,8 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
>  PERF_X86_EVENT_PEBS_LD_HSW|
>  PERF_X86_EVENT_PEBS_NA_HSW))
> data.data_src.val =
> -   precise_store_data_hsw(event, pebs->dse);
> +   precise_store_data_hsw(event, pebs->dse,
> +  event->hw.flags);
> else
> data.data_src.val = precise_store_data(pebs->dse);
> }
> --
> 1.9.3
>


Re: Oops: 17 SMP ARM (v3.16-rc2)

2014-08-07 Thread Troy Kisky
On 8/7/2014 7:38 AM, Fabio Estevam wrote:
> On Thu, Aug 7, 2014 at 11:20 AM, Fabio Estevam  wrote:
> 
> ,but I am wondering if we should also do:
> 
> --- a/arch/arm/boot/dts/imx6qdl-sabreauto.dtsi
> +++ b/arch/arm/boot/dts/imx6qdl-sabreauto.dtsi
> @@ -66,6 +66,7 @@
> pinctrl-0 = <&pinctrl_enet>;
> phy-mode = "rgmii";
> interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_HIGH>,
> + <&intc 0 118 IRQ_TYPE_LEVEL_HIGH>,
>   <&intc 0 119 IRQ_TYPE_LEVEL_HIGH>;
> status = "okay";
>  };
> @@ -226,7 +227,7 @@
> MX6QDL_PAD_RGMII_RD2__RGMII_RD2 0x1b0b0
> MX6QDL_PAD_RGMII_RD3__RGMII_RD3 0x1b0b0
> MX6QDL_PAD_RGMII_RX_CTL__RGMII_RX_CTL   0x1b0b0
> -   MX6QDL_PAD_GPIO_6__ENET_IRQ 0x000b1
> +   MX6QDL_PAD_GPIO_6__ENET_IRQ 0x40b1
> 
> Since the Workaround for erratum ERR006687 states that the SION bit
> needs to be used:
> 
> "All of the interrupts can be selected by MUX and output to pad GPIO6.
> If GPIO6 is selected to
> output ENET interrupts and GPIO6 SION is set, the resulting GPIO
> interrupt will wake the system
> from Wait mode."
> 
arch/arm/boot/dts/imx6q-pinfunc.h:#define MX6QDL_PAD_GPIO_6__ENET_IRQ 0x230 0x600 0x03c 0x11 0xff000609

So, the SION bit should already be set (0x11). But the other way works too.


Troy


[PATCH] Fix race in get_request()

2014-08-07 Thread Jörn Engel
Hello Jens!

I came across the below while investigating some other problem.
Something here doesn't seem right.  This looks like an obvious bug and
something roughly along the lines of my patch would fix it.  But I
must be in the wrong decade to find such a bug in the block layer.

Is this for real?  Or if not, what am I missing?

Jörn

--

If __get_request() returns NULL, get_request will call
prepare_to_wait_exclusive() followed by io_schedule().  Not rechecking
the sleep condition after prepare_to_wait_exclusive() leaves a race
where the condition changes before prepare_to_wait_exclusive(), but
not after and accordingly this thread never gets woken up.

The race must be exceedingly hard to hit, otherwise I cannot explain how
such a classic race could outlive the last millennium.

Signed-off-by: Joern Engel 
---
 block/blk-core.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 3275353957f0..00aa6c7abe5a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1068,6 +1068,11 @@ retry:
 
trace_block_sleeprq(q, bio, rw_flags & 1);
 
+   rq = __get_request(rl, rw_flags, bio, gfp_mask);
+   if (rq) {
+   finish_wait(&rl->wait[is_sync], &wait);
+   return rq;
+   }
spin_unlock_irq(q->queue_lock);
io_schedule();
 
-- 
2.0.0.rc0.1.g7b2ba98
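
A single-threaded sketch of the interleaving the changelog describes,
assuming that a release can land between the failed allocation and the
point where the waiter is queued. Whether the block layer's locking
actually permits that window is the question the message itself raises.
The sk_* names are hypothetical and only model the ordering, not the
real wait-queue API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a request list with one potential waiter. */
struct sk_queue {
	int free_requests;
	bool waiter_queued;	/* set once the waiter is on the wait queue */
	bool waiter_woken;	/* set when a release wakes a queued waiter */
};

static bool sk_try_get(struct sk_queue *q)
{
	if (q->free_requests > 0) {
		q->free_requests--;
		return true;
	}
	return false;
}

/* Release side: a wakeup only reaches a waiter that is already queued. */
static void sk_put(struct sk_queue *q)
{
	q->free_requests++;
	if (q->waiter_queued)
		q->waiter_woken = true;
}

/*
 * Replays the assumed interleaving. Returns true when the caller would go
 * to sleep although a request is free and no wakeup is pending.
 */
static bool sk_lost_wakeup(bool recheck)
{
	struct sk_queue q = { 0, false, false };

	if (sk_try_get(&q))		/* first attempt fails: nothing free */
		return false;
	sk_put(&q);			/* release lands before we are queued */
	q.waiter_queued = true;		/* models prepare_to_wait_exclusive() */
	if (recheck && sk_try_get(&q)) {
		q.waiter_queued = false;	/* models finish_wait() */
		return false;
	}
	return !q.waiter_woken;		/* models sleeping in io_schedule() */
}
```

In this model the recheck after queueing is what closes the window: the
second attempt sees the request that was released in between.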



Re: [RFC PATCH] kprobes: arm: enable OPTPROBES for arm 32

2014-08-07 Thread Wang Nan
On 2014/8/7 14:59, Masami Hiramatsu wrote:
> (2014/08/06 15:24), Wang Nan wrote:
 +
 +static void
 +optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
 +{
 +  unsigned long flags;
 +
 +  regs->ARM_pc = (unsigned long)op->kp.addr;
 +  regs->ARM_ORIG_r0 = ~0UL;
 +
 +
 +  local_irq_save(flags);
 +  /* 
 +   * This is possible if op is under delayed unoptimizing.
 +   * We need simulate the replaced instruction.
 +   */
 +  if (kprobe_disabled(&op->kp)) {
 +  struct kprobe *p = &op->kp;
 +  op->kp.ainsn.insn_singlestep(p->opcode, &p->ainsn, regs);
 +  } else {
 +  kprobe_handler(regs);
 +  }
>>>
>>> You don't need brace "{}" for one statement.
>>> By the way, why don't you call opt_pre_handler()?
>>>
>>
>> I use kprobe_handler because it handles instruction emulation.
>>
>> In addition, I'm not very sure whether skipping the complex checks
>> in kprobe_handler() is safe or not.
> 
> That seems to do same thing on x86. Then you should do something like
> the optimized_callback() on x86 as below.
> 
> static void
> optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
> {
> struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> unsigned long flags;
> 
> local_irq_save(flags);
> if (kprobe_running()) {
> kprobes_inc_nmissed_count(&op->kp);

In this case we still need a singlestep, right?

> } else {
> /* Save skipped registers */
> regs->ARM_pc = (unsigned long)op->kp.addr;
> regs->ARM_ORIG_r0 = ~0UL;
> 
> __this_cpu_write(current_kprobe, &op->kp);
> kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> opt_pre_handler(&op->kp, regs);
> __this_cpu_write(current_kprobe, NULL);
> op->kp.ainsn.insn_singlestep(op->kp.opcode, &op->kp.ainsn, regs);
> }
> local_irq_restore(flags);
> }
> 
> Thank you,
> 




Re: [PATCH v2] arch,locking: Ciao arch_mutex_cpu_relax()

2014-08-07 Thread Davidlohr Bueso
On Tue, 2014-08-05 at 10:42 -0700, Davidlohr Bueso wrote:
> On Tue, 2014-08-05 at 15:04 +0200, Geert Uytterhoeven wrote:
> > It looks like you forgot to update frv?  It's been failing on -next since a
> > few days:

Is there any way developers can be alerted sooner about this (i.e. while
it's still in the -next phase), like automated emails or something? This
would be extra nice for those archs that are harder to get tested.

> > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > function 'cpu_relax_lowlatency'
> > [-Werror=implicit-function-declaration]
> > cc1: some warnings being treated as errors
> > kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of
> > function 'cpu_relax_lowlatency'
> > [-Werror=implicit-function-declaration]
> > make[3]: *** [kernel/locking/mcs_spinlock.o] Error 1
> > cc1: some warnings being treated as errors
> > make[3]: *** [kernel/locking/mutex.o] Error 1
> > 
> > http://kisskb.ellerman.id.au/kisskb/buildresult/11616307/
> 
> Ah, indeed. Thanks for the report, afaict this was the only missing
> arch .

Adding Guenter who also reported this yesterday.

Linus, since this is build-breaking an entire arch, it might be worth
avoiding the whole -tip thing and get the fix in as soon as possible.

Thanks,
Davidlohr



[GIT PULL] SELinux/NetLabel fixes for 3.17

2014-08-07 Thread Paul Moore
Hello All,

Two small patches to fix a couple of build warnings in SELinux and NetLabel.  
The patches are obvious enough that I don't think any additional explanation 
is necessary, but it basically boils down to the usual: I was stupid, and 
these patches fix some of the stupid.

Both patches were posted earlier this week to the SELinux list, and that is 
where they sat as I didn't think they were noteworthy enough to go upstream 
at this point in time, but DaveM would rather see them upstream now so who am 
I to argue.  As the patches are both very small, I'm including the combined 
changes with this pull request in case anyone wants to scrutinize them any 
further.

Enjoy,
-Paul

--
The following changes since commit 4fbe63d1c773cceef3fe1f6ed0c9c268f4f24760:

  netlabel: shorter names for the NetLabel catmap funcs/structs
(2014-08-01 11:17:37 -0400)

are available in the git repository at:

  git://git.infradead.org/users/pcmoore/selinux stable-3.17

for you to fetch changes up to 942ba3646543aeb3e5729c35d10ac43424bf0b68:

  selinux: remove unused variabled in the netport, netnode, and netif
   caches (2014-08-07 20:55:30 -0400)


Paul Moore (2):
  netlabel: fix the netlbl_catmap_setlong() dummy function
  selinux: remove unused variabled in the netport, netnode, and netif
   caches

 include/net/netlabel.h | 8 
 security/selinux/netif.c   | 4 ++--
 security/selinux/netnode.c | 3 +--
 security/selinux/netport.c | 3 +--
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/net/netlabel.h b/include/net/netlabel.h
index a4fc39b..7b5a300 100644
--- a/include/net/netlabel.h
+++ b/include/net/netlabel.h
@@ -524,10 +524,10 @@ static inline int netlbl_catmap_setrng(struct netlbl_lsm_catmap **catmap,
 {
return 0;
 }
-static int netlbl_catmap_setlong(struct netlbl_lsm_catmap **catmap,
-u32 offset,
-unsigned long bitmap,
-gfp_t flags)
+static inline int netlbl_catmap_setlong(struct netlbl_lsm_catmap **catmap,
+   u32 offset,
+   unsigned long bitmap,
+   gfp_t flags)
 {
return 0;
 }
diff --git a/security/selinux/netif.c b/security/selinux/netif.c
index 3c3de4c..50ce177 100644
--- a/security/selinux/netif.c
+++ b/security/selinux/netif.c
@@ -272,7 +272,7 @@ static struct notifier_block sel_netif_netdev_notifier = {
 
 static __init int sel_netif_init(void)
 {
-   int i, err;
+   int i;
 
if (!selinux_enabled)
return 0;
@@ -282,7 +282,7 @@ static __init int sel_netif_init(void)
 
register_netdevice_notifier(&sel_netif_netdev_notifier);
 
-   return err;
+   return 0;
 }
 
 __initcall(sel_netif_init);
diff --git a/security/selinux/netnode.c b/security/selinux/netnode.c
index ddf3152..da923f8 100644
--- a/security/selinux/netnode.c
+++ b/security/selinux/netnode.c
@@ -303,7 +303,6 @@ void sel_netnode_flush(void)
 static __init int sel_netnode_init(void)
 {
int iter;
-   int ret;
 
if (!selinux_enabled)
return 0;
@@ -313,7 +312,7 @@ static __init int sel_netnode_init(void)
sel_netnode_hash[iter].size = 0;
}
 
-   return ret;
+   return 0;
 }
 
 __initcall(sel_netnode_init);
diff --git a/security/selinux/netport.c b/security/selinux/netport.c
index 73ac678..3311cc3 100644
--- a/security/selinux/netport.c
+++ b/security/selinux/netport.c
@@ -237,7 +237,6 @@ void sel_netport_flush(void)
 static __init int sel_netport_init(void)
 {
int iter;
-   int ret;
 
if (!selinux_enabled)
return 0;
@@ -247,7 +246,7 @@ static __init int sel_netport_init(void)
sel_netport_hash[iter].size = 0;
}
 
-   return ret;
+   return 0;
 }
 
 __initcall(sel_netport_init);
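
A minimal sketch of the header-stub pattern behind the netlabel.h change,
under stated assumptions: SKETCH_FEATURE_ENABLED is a hypothetical switch
standing in for the real config option, and sketch_setlong() is a
hypothetical function. A plain "static" stub in a header is emitted into
every file that includes it and typically draws an unused-function
warning where it is not called; "static inline" avoids that.

```c
#include <assert.h>

#ifdef SKETCH_FEATURE_ENABLED
/* Real implementation lives elsewhere when the feature is built in. */
int sketch_setlong(unsigned int offset, unsigned long bitmap);
#else
/* Dummy version for builds without the feature: inline, returns success. */
static inline int sketch_setlong(unsigned int offset, unsigned long bitmap)
{
	(void)offset;
	(void)bitmap;
	return 0;
}
#endif

/* Caller compiles the same way whether or not the feature is enabled. */
static inline int sketch_caller(void)
{
	return sketch_setlong(0, 1UL);
}
```

The three SELinux hunks are the other half of the cleanup: each init
function now returns a constant 0 instead of a local that was never set.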

-- 
paul moore
security and virtualization @ redhat


Re: [PATCH] usb: dwc2: Read GNPTXFSIZ when in forced HOST mode.

2014-08-07 Thread Kever Yang

Doug:

On 08/08/2014 03:48 AM, Doug Anderson wrote:

The documentation for GNPTXFSIZ says that "For host mode, this field
is always valid."  Since we're already switching to host mode for
HPTXFSIZ, let's also read GNPTXFSIZ in host mode.

On an rk3288 SoC, without this change we see this at bootup:
   dwc2 ff58.usb: gnptxfsiz=00100400
   dwc2 ff58.usb: 128 invalid for host_nperio_tx_fifo_size. Check HW configuration.

After this change we see:
   dwc2 ff58.usb: gnptxfsiz=04000400

Yep, that is the problem that causes the log you showed on the rk3288-evb
and, furthermore, causes the FIFO setting to fail.

I was planning to submit the same patch.
It's great that you also found the problem and sent this patch.


Signed-off-by: Doug Anderson 
---
  drivers/usb/dwc2/core.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index 27d2c9b..c184ed43 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -2674,23 +2674,23 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg)
hwcfg2 = readl(hsotg->regs + GHWCFG2);
hwcfg3 = readl(hsotg->regs + GHWCFG3);
hwcfg4 = readl(hsotg->regs + GHWCFG4);
-   gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);
grxfsiz = readl(hsotg->regs + GRXFSIZ);
  
  	dev_dbg(hsotg->dev, "hwcfg1=%08x\n", hwcfg1);

dev_dbg(hsotg->dev, "hwcfg2=%08x\n", hwcfg2);
dev_dbg(hsotg->dev, "hwcfg3=%08x\n", hwcfg3);
dev_dbg(hsotg->dev, "hwcfg4=%08x\n", hwcfg4);
-   dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
dev_dbg(hsotg->dev, "grxfsiz=%08x\n", grxfsiz);
  
-	/* Force host mode to get HPTXFSIZ exact power on value */

+   /* Force host mode to get HPTXFSIZ / GNPTXFSIZ exact power on value */
gusbcfg = readl(hsotg->regs + GUSBCFG);
gusbcfg |= GUSBCFG_FORCEHOSTMODE;
writel(gusbcfg, hsotg->regs + GUSBCFG);
usleep_range(10, 15);
  
+	gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);

hptxfsiz = readl(hsotg->regs + HPTXFSIZ);
+   dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
dev_dbg(hsotg->dev, "hptxfsiz=%08x\n", hptxfsiz);
gusbcfg = readl(hsotg->regs + GUSBCFG);
gusbcfg &= ~GUSBCFG_FORCEHOSTMODE;

There may still be a potential problem to fix: grxfsiz may already have
been changed, because the bootrom and U-Boot will change this value if they
use the dwc2 controller. The way we read the register here cannot guarantee
that we get the power-on value, which is what we actually need.

Let me do more testing on that; maybe we need another patch.

Anyway, this patch works and is reasonable.
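
A small sketch of how the two gnptxfsiz values in the log decode, assuming
the FIFO depth sits in the upper 16 bits and the start address in the lower
16 bits, and assuming a requested depth is accepted when it is at least 16
and no larger than the hardware depth. The sk_* helpers are hypothetical and
not the driver's own functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register layout: [31:16] depth, [15:0] start address. */
static uint32_t sk_fifo_depth(uint32_t reg)
{
	return (reg >> 16) & 0xffffu;
}

static uint32_t sk_fifo_start(uint32_t reg)
{
	return reg & 0xffffu;
}

/* Assumed validity rule for a requested non-periodic TX FIFO depth. */
static int sk_depth_valid(uint32_t requested, uint32_t reg)
{
	return requested >= 16 && requested <= sk_fifo_depth(reg);
}
```

Under these assumptions 00100400 reports a depth of only 16 words, so a
request for 128 is rejected, while 04000400 reports 1024 words and the
same request fits.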

Reviewed-by: Kever Yang 





Re: sel_netif_init: 'err' is used uninitialized

2014-08-07 Thread Paul Moore
On Thursday, August 07, 2014 12:04:54 PM David Miller wrote:
> From: Paul Moore 
> Date: Thu, 07 Aug 2014 10:26:14 -0400
> 
> > On Thursday, August 07, 2014 12:31:15 PM Geert Uytterhoeven wrote:
> > 
> >> 
> >> security/selinux/netif.c: In function ‘sel_netif_init’:
> >> /scratch/geert/linux/linux-m68k/security/selinux/netif.c:285: warning:
> >> ‘err’ is used uninitialized in this function
> >> 
> >> Should it just return 0, like before?
> >> Or should it return the return value of register_netdevice_notifier()
> >> instead, which also returns an error code? Or is that failure
> >> non-critical?
> > 
> > Hi,
> > 
> > I posted a fix for this two days ago to the SELinux list (see below).  As
> > soon as -rc1 is released and linux-next is back in business I'll be
> > pushing the patch to the SELinux #next branch.
> > 
> >  * http://marc.info/?l=selinux&m=140727033030054&w=2
> 
> With respect to this and the lack-of-inline warning fix we spoke about
> yesterday, why are you waiting and only pushing such bug fixes into your
> "next" branch?

Simply put, I didn't think the patches were significant enough to push at this 
point in time.
 
> Those sort of things should be sent to Linus now to correct the errors
> introduced during the merge window, as I have done last night for all
> of the networking merge fallout.

I'll (re)post the patches with a pull request in just a moment, CC'ing all the 
various mailing lists and you guys can figure out who best to merge them.

-- 
paul moore
www.paul-moore.com



RE: [PATCH] usb: dwc2: Read GNPTXFSIZ when in forced HOST mode.

2014-08-07 Thread Paul Zimmerman
> From: diand...@google.com [mailto:diand...@google.com] On Behalf Of Doug Anderson
> Sent: Thursday, August 07, 2014 5:12 PM
> 
> On Thu, Aug 7, 2014 at 1:18 PM, Paul Zimmerman
>  wrote:
> >> From: Doug Anderson [mailto:diand...@chromium.org]
> >> Sent: Thursday, August 07, 2014 12:48 PM
> >>
> >> The documentation for GNPTXFSIZ says that "For host mode, this field
> >> is always valid."  Since we're already switching to host mode for
> >> HPTXFSIZ, let's also read GNPTXFSIZ in host mode.
> >>
> >> On an rk3288 SoC, without this change we see this at bootup:
> >>   dwc2 ff58.usb: gnptxfsiz=00100400
> >>   dwc2 ff58.usb: 128 invalid for host_nperio_tx_fifo_size. Check HW configuration.
> >>
> >> After this change we see:
> >>   dwc2 ff58.usb: gnptxfsiz=04000400
> >>
> >> Signed-off-by: Doug Anderson 
> >> ---
> >>  drivers/usb/dwc2/core.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
> >> index 27d2c9b..c184ed43 100644
> >> --- a/drivers/usb/dwc2/core.c
> >> +++ b/drivers/usb/dwc2/core.c
> >> @@ -2674,23 +2674,23 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg)
> >>   hwcfg2 = readl(hsotg->regs + GHWCFG2);
> >>   hwcfg3 = readl(hsotg->regs + GHWCFG3);
> >>   hwcfg4 = readl(hsotg->regs + GHWCFG4);
> >> - gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);
> >>   grxfsiz = readl(hsotg->regs + GRXFSIZ);
> >>
> >>   dev_dbg(hsotg->dev, "hwcfg1=%08x\n", hwcfg1);
> >>   dev_dbg(hsotg->dev, "hwcfg2=%08x\n", hwcfg2);
> >>   dev_dbg(hsotg->dev, "hwcfg3=%08x\n", hwcfg3);
> >>   dev_dbg(hsotg->dev, "hwcfg4=%08x\n", hwcfg4);
> >> - dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
> >>   dev_dbg(hsotg->dev, "grxfsiz=%08x\n", grxfsiz);
> >>
> >> - /* Force host mode to get HPTXFSIZ exact power on value */
> >> + /* Force host mode to get HPTXFSIZ / GNPTXFSIZ exact power on value */
> >>   gusbcfg = readl(hsotg->regs + GUSBCFG);
> >>   gusbcfg |= GUSBCFG_FORCEHOSTMODE;
> >>   writel(gusbcfg, hsotg->regs + GUSBCFG);
> >>   usleep_range(10, 15);
> >>
> >> + gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);
> >>   hptxfsiz = readl(hsotg->regs + HPTXFSIZ);
> >> + dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
> >>   dev_dbg(hsotg->dev, "hptxfsiz=%08x\n", hptxfsiz);
> >>   gusbcfg = readl(hsotg->regs + GUSBCFG);
> >>   gusbcfg &= ~GUSBCFG_FORCEHOSTMODE;
> >
> > Nice! I wonder if this is a bug in the original driver, and they
> > actually meant to read this register instead of HPTXFSIZ? Well, it
> > doesn't really matter I guess.
> >
> > Acked-by: Paul Zimmerman 
> >
> > You may want to resend this to Greg after -rc1 is out and he reopens
> > his usb-next tree.
> 
> Thanks!
> 
> ...do you think this is a fix that needs to go in for 3.17?  Is it
> affecting anyone there?

I don't think so. It only has an effect if the controller starts up in
peripheral mode and then later switches to host mode, right? I think
yours is the first dwc2 platform in the kernel with that capability.

-- 
Paul



Re: [PATCH v4 2/4] usb: dwc2: add compatible data for rockchip soc

2014-08-07 Thread Kever Yang

Paul,

On 08/08/2014 02:26 AM, Paul Zimmerman wrote:

From: Kever Yang [mailto:kever.y...@gmail.com] On Behalf Of Kever Yang
Sent: Thursday, August 07, 2014 2:35 AM

This patch add compatible data for dwc2 controller found on
rk3066, rk3188 and rk3288 processors from rockchip.

Signed-off-by: Kever Yang 
Acked-by: Paul Zimmerman 
---

Changes in v4:
- max_transfer_size change to 65536, this should be enough
   for most transfer, the hardware auto-detect will set this
   to 0x7 which may make dma_alloc_coherent fail when
   non-dword aligned buf from driver like usbnet happen.

Hi Kever,

Did you test this change thoroughly? I have vague memories of any
value above 65535 causing problems, at least on my hardware. And I
see it is set to 65535 in both pci.c and platform.c. I could be
wrong, but I thought I should mention it.

I tested it on the rk3288 EVB and it works fine with 65536. I'm sorry I
didn't mention that in my patch.
The problem on my platform is that if the value comes from hardware
auto-detection, it will be 0x7, and that causes dma_alloc_coherent to
fail in the hcd driver.

A value less than 0x7 should be fine for the hardware, but for the
software it depends on how we use it.

What kind of problem did you meet? A software problem or a hardware problem?
Maybe I should pay more attention to this value. :)






Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup

2014-08-07 Thread Davidlohr Bueso
On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> On a highly contended rwsem, spinlock contention due to the slow
> rwsem_wake() call can be a significant portion of the total CPU cycles
> used. With writer lock stealing and writer optimistic spinning, there
> is also a pretty good chance that the lock may have been stolen
> before the waker wakes up the waiters. The woken tasks, if any,
> will have to go back to sleep again.

Good catch! And this applies to mutexes as well. How about something
like this:

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dadbf88..e037588 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
 
 #endif
 
+#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   struct task_struct *owner = ACCESS_ONCE(lock->owner);
+
+   return owner != NULL;
+}
+#else
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+   return false;
+}
+#endif
+
 /*
  * Release the lock, slowpath:
  */
@@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
mutex_release(&lock->dep_map, nested, _RET_IP_);
debug_mutex_unlock(lock);
 
+   /*
+* Abort the wakeup operation if there is an active writer as the
+* lock was stolen. mutex_unlock() should have cleared the owner field
+* before calling this function. If that field is now set, there must
+* be an active writer present.
+*/
+   if (mutex_has_owner(lock))
+   goto done;
+
if (!list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */
struct mutex_waiter *waiter =
@@ -744,7 +767,7 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 
wake_up_process(waiter->task);
}
-
+done:
spin_unlock_mutex(&lock->wait_lock, flags);
 }
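
A minimal sketch of the decision the proposed hunk adds to the unlock
slowpath, assuming the owner field is reliable at that point. struct
sk_mutex and sk_should_wake() are hypothetical; the real code works on the
wait list under wait_lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an owner pointer and a count of queued waiters. */
struct sk_mutex {
	void *owner;	/* non-NULL once somebody has taken the lock */
	int waiters;
};

/* Returns 1 when the unlocker should wake the first waiter. */
static int sk_should_wake(const struct sk_mutex *lock)
{
	/* Lock already taken again: a woken waiter would only sleep again. */
	if (lock->owner != NULL)
		return 0;

	return lock->waiters > 0;
}
```

The point of the check is to skip a wakeup that cannot succeed, which
saves the woken task a round trip back to sleep.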
 








[PATCH 3/4] perf, x86: Fix haswell mem hierarchy flags reporting

2014-08-07 Thread Andi Kleen
From: Andi Kleen 

This fixes a bug introduced with

commit 722e76e60f2775c21b087ff12c5e678cf0ebcaaf
Author: Stephane Eranian 
Date:   Thu May 15 17:56:44 2014 +0200

fix Haswell precise store data source encoding

When returning early we need to return the complete value of the
memory hierarchy, not just the mem_lvl. Otherwise any load/store/na
flags set early get lost.

Signed-off-by: Andi Kleen 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 855c19e..8096d24 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -121,7 +121,6 @@ static u64 precise_store_data_hsw(struct perf_event *event, u64 status,
dse.mem_op = PERF_MEM_OP_STORE;
else
dse.mem_op = PERF_MEM_OP_NA;
-   dse.mem_lvl = PERF_MEM_LVL_NA;
 
/*
 * L1 info only valid for following events:
@@ -131,8 +130,10 @@ static u64 precise_store_data_hsw(struct perf_event *event, u64 status,
 * MEM_UOPS_RETIRED.SPLIT_STORES
 * MEM_UOPS_RETIRED.ALL_STORES
 */
-   if (cfg != 0x12d0 && cfg != 0x22d0 && cfg != 0x42d0 && cfg != 0x82d0)
-   return dse.mem_lvl;
+   if (cfg != 0x12d0 && cfg != 0x22d0 && cfg != 0x42d0 && cfg != 0x82d0) {
+   dse.mem_lvl = PERF_MEM_LVL_NA;
+   return dse.val;
+   }
 
if (status & 1)
dse.mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_HIT;
-- 
1.9.3
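
A small sketch of why the early return has to hand back the whole value
and not just the level field. It assumes the operation field occupies the
low 5 bits and the level field starts at bit 5, with the flag values as
defined below. The SK_* names and both helper functions are hypothetical
and use explicit shifts instead of the kernel's union.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout and flag values (illustration only). */
#define SK_MEM_OP_SHIFT   0
#define SK_MEM_OP_MASK    0x1fu
#define SK_MEM_OP_LOAD    0x02u
#define SK_MEM_LVL_SHIFT  5
#define SK_MEM_LVL_MASK   0x3fffu
#define SK_MEM_LVL_NA     0x01u

/* Old behaviour: only the bare level field is returned, op is dropped. */
static uint64_t sk_early_return_old(uint64_t op)
{
	uint64_t mem_lvl = SK_MEM_LVL_NA;

	(void)op;
	return mem_lvl;
}

/* New behaviour: the full encoded value, op and level both in place. */
static uint64_t sk_early_return_new(uint64_t op)
{
	return (op << SK_MEM_OP_SHIFT) |
	       ((uint64_t)SK_MEM_LVL_NA << SK_MEM_LVL_SHIFT);
}
```

In this model the old return value decodes with an empty level field and
without the load flag, while the new one keeps both fields intact.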



Update new perf PEBS tables patchkit

2014-08-07 Thread Andi Kleen
This patchkit revamps the PEBS tables on most Intel CPUs,
both simplifying and fixing a couple of problems on Haswell:
- All PEBS events are supported
- The address and data source is now always correctly reported.
It also fixes one regression added earlier.

I added the fix from Stephane to correctly initialize the
data source default for other events. No open issues in
this version, should be good to merge.

-Andi



[PATCH 4/4] perf: Fix the default data source

2014-08-07 Thread Andi Kleen
From: Stephane Eranian 

- The default of 0 for the data source in perf_sample_data_init() was
  wrong: 0 is not a valid value. Define PERF_MEM_NA (not available) and
  use it as the default instead.

Signed-off-by: Stephane Eranian 
Signed-off-by: Andi Kleen 
---
 include/linux/perf_event.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 707617a..8b206aa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -604,6 +604,13 @@ struct perf_sample_data {
u64 txn;
 };
 
+/* default value for data source */
+#define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
+   PERF_MEM_S(LVL, NA)   |\
+   PERF_MEM_S(SNOOP, NA) |\
+   PERF_MEM_S(LOCK, NA)  |\
+   PERF_MEM_S(TLB, NA))
+
 static inline void perf_sample_data_init(struct perf_sample_data *data,
 u64 addr, u64 period)
 {
@@ -616,7 +623,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->regs_user.regs = NULL;
data->stack_user_size = 0;
data->weight = 0;
-   data->data_src.val = 0;
+   data->data_src.val = PERF_MEM_NA;
data->txn = 0;
 }
 
-- 
1.9.3
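
To make the new default concrete, here is a small userspace sketch of how a PERF_MEM_S()-style macro composes the "not available" value of every field into one non-zero word. The NA values and shift amounts below are what I recall from the perf uapi header; treat them, and the MEM_* names, as assumptions rather than authoritative definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed "not available" encodings, one per field. */
#define MEM_OP_NA     0x01
#define MEM_LVL_NA    0x01
#define MEM_SNOOP_NA  0x01
#define MEM_LOCK_NA   0x01
#define MEM_TLB_NA    0x01

/* Assumed bit offsets of each field inside the 64-bit data source word. */
#define MEM_OP_SHIFT     0
#define MEM_LVL_SHIFT    5
#define MEM_SNOOP_SHIFT  19
#define MEM_LOCK_SHIFT   24
#define MEM_TLB_SHIFT    26

/* Same token-pasting shape as PERF_MEM_S(a, s). */
#define MEM_S(a, s) (((uint64_t)MEM_##a##_##s) << MEM_##a##_SHIFT)

/* Default data source: every field explicitly marked "not available". */
uint64_t mem_na_default(void)
{
	uint64_t v = MEM_S(OP, NA) | MEM_S(LVL, NA) | MEM_S(SNOOP, NA) |
		     MEM_S(LOCK, NA) | MEM_S(TLB, NA);

	assert(v != 0);  /* 0 would not be a valid encoding */
	return v;
}
```

Under these assumed encodings the default works out to 0x5080021, which a consumer can tell apart from an uninitialized 0.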



[PATCH 1/4] perf, x86: Revamp PEBS event selection

2014-08-07 Thread Andi Kleen
From: Andi Kleen 

The basic idea is that it does not make sense to list all PEBS
events individually. The list is very long, sometimes outdated
and the hardware doesn't need it. If an event does not support
PEBS it will just not count, there is no security issue.

We only need to list events that do something special, like
supporting load or store addresses.

This vastly simplifies the PEBS event selection. It also
speeds up the scheduling because the scheduler doesn't
have to walk as many constraints.

Bugs fixed:
- We do not allow setting forbidden flags with PEBS anymore
(SDM 18.9.4), except for the special cycle event.
This is done using a new constraint macro that also
matches on the event flags.
- Correct DataLA and load/store/na flags reporting on Haswell
[Requires a follow-on patch]
- We did not allow all PEBS events on Haswell:
We were missing some valid subevents in d1-d2 (MEM_LOAD_UOPS_RETIRED.*,
MEM_LOAD_UOPS_L3_HIT_RETIRED.*)
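
As a rough illustration of the first bug fix in the list above, the sketch below shows how widening a constraint's compare mask to include the event flags makes an event with a forbidden flag fall through. The struct, the helper names and the subset of flag bits are hypothetical simplifications; the bit positions follow the architectural event-select layout as I understand it, and 0x82d0 is one of the store event codes that appears in the Haswell hunks in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVENT_MASK  0x0000ffffULL   /* event code + umask */
#define FLAG_EDGE   (1ULL << 18)
#define FLAG_ANY    (1ULL << 21)
#define FLAG_INV    (1ULL << 23)
#define FLAG_CMASK  0xff000000ULL
#define ALL_FLAGS   (FLAG_EDGE | FLAG_ANY | FLAG_INV | FLAG_CMASK)

/* Hypothetical, trimmed-down constraint: a code and the mask to compare under. */
struct constraint {
	uint64_t code;
	uint64_t cmask;
};

/* An event matches when its masked config equals the constraint code. */
bool constraint_match(const struct constraint *c, uint64_t config)
{
	assert(c);
	return (config & c->cmask) == c->code;
}

/* Old style: flags are masked away before comparing, so they are ignored. */
bool pebs_ok_old(uint64_t config)
{
	const struct constraint c = { 0x82d0, EVENT_MASK };

	return constraint_match(&c, config);
}

/* New style: flags take part in the comparison, so any set flag is a mismatch. */
bool pebs_ok_new(uint64_t config)
{
	const struct constraint c = { 0x82d0, EVENT_MASK | ALL_FLAGS };

	return constraint_match(&c, config);
}
```

For example, 0x82d0 with the invert bit set still matches the old-style constraint but is rejected by the flags-aware one, while plain 0x82d0 matches both.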

This includes the changes proposed by Stephane earlier and obsoletes
his patchkit (except for some changes on pre Sandy Bridge/Silvermont
CPUs)

I have only converted Sandy Bridge, Silvermont and later so far, mostly
because these are the parts whose hardware behavior I could confirm
directly with the hardware architects. Also I do not believe the older
CPUs have any missing events in their PEBS list, so there's no pressing
need to change them.

I did not implement the flag proposed by Peter to allow
setting forbidden flags. If really needed this could
be implemented on top of this patch.

Cc: eran...@google.com
v2: Fix broken store events on SNB/IVB (Stephane Eranian)
v3: More fixes. Rename some arguments (Stephane Eranian)
v4: List most Haswell events individually again to report
memory operation type correctly.
Add new flags to describe load/store/na for datala.
Update description.
Signed-off-by: Andi Kleen 
---
 arch/x86/include/asm/perf_event.h |   8 +++
 arch/x86/kernel/cpu/perf_event.h  |  48 --
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 107 ++
 3 files changed, 85 insertions(+), 78 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 8249df4..8dfc9fd 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -51,6 +51,14 @@
 ARCH_PERFMON_EVENTSEL_EDGE  |  \
 ARCH_PERFMON_EVENTSEL_INV   |  \
 ARCH_PERFMON_EVENTSEL_CMASK)
+#define X86_ALL_EVENT_FLAGS\
+   (ARCH_PERFMON_EVENTSEL_EDGE |   \
+ARCH_PERFMON_EVENTSEL_INV |\
+ARCH_PERFMON_EVENTSEL_CMASK |  \
+ARCH_PERFMON_EVENTSEL_ANY |\
+ARCH_PERFMON_EVENTSEL_PIN_CONTROL |\
+HSW_IN_TX |\
+HSW_IN_TX_CHECKPOINTED)
 #define AMD64_RAW_EVENT_MASK   \
(X86_RAW_EVENT_MASK  |  \
 AMD64_EVENTSEL_EVENT)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 1ee4e76..e6bdcb7 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -77,8 +77,10 @@ struct event_constraint {
  */
 #define PERF_X86_EVENT_PEBS_LDLAT  0x1 /* ld+ldlat data address sampling */
 #define PERF_X86_EVENT_PEBS_ST 0x2 /* st data address sampling */
-#define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style st data sampling */
+#define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style datala, store */
 #define PERF_X86_EVENT_COMMITTED   0x8 /* event passed commit_txn */
+#define PERF_X86_EVENT_PEBS_LD_HSW 0x10 /* haswell style datala, load */
+#define PERF_X86_EVENT_PEBS_NA_HSW 0x20 /* haswell style datala, unknown */
 
 struct amd_nb {
int nb_id;  /* NorthBridge id */
@@ -262,18 +264,52 @@ struct cpu_hw_events {
EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK)
 
 #define INTEL_PLD_CONSTRAINT(c, n) \
-   __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+   __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
   HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT)
 
 #define INTEL_PST_CONSTRAINT(c, n) \
-   __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+   __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
  HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST)
 
-/* DataLA version of store sampling without extra enable bit. */
-#define INTEL_PST_HSW_CONSTRAINT(c, n) \
-   __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+/* Event constraint, but match on all event flags too. */
+#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \
+   EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS)
+
+/* Check only flags, but allow all event/umask */
+#define INTEL_ALL_EVENT_CONSTRAINT(code, n)\
+   EVENT_CONSTRAINT(code, n, X86_ALL_EVENT_FLAGS)
+
+/* Check flags and event code, and set the HSW store flag */
+#define 

[PATCH 2/4] perf, x86: Don't mark DataLA addresses as store

2014-08-07 Thread Andi Kleen
From: Andi Kleen 

Haswell supports reporting the data address for a range
of PEBS events, including:

UOPS_RETIRED.ALL
MEM_UOPS_RETIRED.STLB_MISS_LOADS
MEM_UOPS_RETIRED.STLB_MISS_STORES
MEM_UOPS_RETIRED.LOCK_LOADS
MEM_UOPS_RETIRED.SPLIT_LOADS
MEM_UOPS_RETIRED.SPLIT_STORES
MEM_UOPS_RETIRED.ALL_LOADS
MEM_UOPS_RETIRED.ALL_STORES
MEM_LOAD_UOPS_RETIRED.L1_HIT
MEM_LOAD_UOPS_RETIRED.L2_HIT
MEM_LOAD_UOPS_RETIRED.L3_HIT
MEM_LOAD_UOPS_RETIRED.L1_MISS
MEM_LOAD_UOPS_RETIRED.L2_MISS
MEM_LOAD_UOPS_RETIRED.L3_MISS
MEM_LOAD_UOPS_RETIRED.HIT_LFB
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM

This facility was already enabled earlier with the original Haswell
perf changes.

However, these addresses were always reported as stores by perf, which is wrong,
as they could be loads or NA too.

This patch uses the load/store/na flags added earlier to report the correct
operation based on the event type. For some events this is NA.

v2: Supports load/stores/na again instead of marking everything NA
Signed-off-by: Andi Kleen 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index aca77e9..855c19e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -108,13 +108,19 @@ static u64 precise_store_data(u64 status)
return val;
 }
 
-static u64 precise_store_data_hsw(struct perf_event *event, u64 status)
+static u64 precise_store_data_hsw(struct perf_event *event, u64 status,
+ unsigned flags)
 {
union perf_mem_data_src dse;
u64 cfg = event->hw.config & INTEL_ARCH_EVENT_MASK;
 
dse.val = 0;
-   dse.mem_op = PERF_MEM_OP_STORE;
+   if (flags & PERF_X86_EVENT_PEBS_LD_HSW)
+   dse.mem_op = PERF_MEM_OP_LOAD;
+   else if (flags & PERF_X86_EVENT_PEBS_ST_HSW)
+   dse.mem_op = PERF_MEM_OP_STORE;
+   else
+   dse.mem_op = PERF_MEM_OP_NA;
dse.mem_lvl = PERF_MEM_LVL_NA;
 
/*
@@ -868,7 +874,8 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 PERF_X86_EVENT_PEBS_LD_HSW|
 PERF_X86_EVENT_PEBS_NA_HSW))
data.data_src.val =
-   precise_store_data_hsw(event, pebs->dse);
+   precise_store_data_hsw(event, pebs->dse,
+  event->hw.flags);
else
 data.data_src.val = precise_store_data(pebs->dse);
}
-- 
1.9.3
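
The flag-to-operation mapping this patch introduces can be sketched in isolation as follows. The three flag values are the ones defined in the perf_event.h hunk of patch 1/4; the operation codes and the function name are illustrative assumptions, as is the precondition that an event carries at most one of the load/store flags.

```c
#include <assert.h>

/* Flag values as defined in patch 1/4 of this series. */
#define PEBS_ST_HSW 0x4   /* haswell style datala, store */
#define PEBS_LD_HSW 0x10  /* haswell style datala, load */
#define PEBS_NA_HSW 0x20  /* haswell style datala, unknown */

/* Illustrative operation codes (assumed, not copied from the uapi header). */
enum mem_op { OP_NA = 0x1, OP_LOAD = 0x2, OP_STORE = 0x4 };

/* Pick the memory operation to report from the event's constraint flags. */
enum mem_op hsw_mem_op(unsigned flags)
{
	/* Assumed precondition: an event is never both a load and a store. */
	assert(!((flags & PEBS_LD_HSW) && (flags & PEBS_ST_HSW)));

	if (flags & PEBS_LD_HSW)
		return OP_LOAD;
	if (flags & PEBS_ST_HSW)
		return OP_STORE;
	return OP_NA;
}
```

Events tagged with the NA flag, or with none of the three, fall through to OP_NA instead of being reported as stores.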



[GIT PULL] LED updates for 3.17

2014-08-07 Thread Bryan Wu
Hi Linus,

This cycle we got:
 - a fix of attribute-creation race for the whole leds subsystem
 - new drivers (HID:GT683R, leds-ipaq-micro)
 - other fixes and cleanups.

Please consider the following changes since commit
a497c3ba1d97fc69c1e78e7b96435ba8c2cb42ee:

  Linux 3.16-rc2 (2014-06-21 19:02:54 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds.git for-next

for you to fetch changes up to e661c8978e4833d4148d08b405a2f3175d6f97d9:

  leds: ipaq-micro: fix sparse non static symbol warning (2014-07-29 10:57:20 -0700)


Janne Kanniainen (3):
  HID: add support for MSI GT683R led panels
  HID: gt683r: fix race condition
  HID: gt683r: move mode attribute to led-class devices

Johan Hovold (13):
  leds: add led-class attribute-group support
  leds: lm3550: fix attribute-creation race
  leds: lm3533: fix attribute-creation race
  leds: lm355x: fix attribute-creation race
  leds: lm3642: fix attribute-creation race
  leds: max8997: fix attribute-creation race
  leds: netxbig: fix attribute-creation race
  leds: ns2: fix attribute-creation race
  leds: ss4200: fix attribute-creation race
  leds: wm831x-status: fix attribute-creation race
  input: lm8323: fix attribute-creation race
  leds: lp55xx-common: fix sysfs entry leak
  leds: lp55xx-common: fix attribute-creation race

Linus Walleij (1):
  leds: add driver for the iPAQ micro

Marek Belisko (1):
  Documentation: dts: tcs6507: Fix wrong statement about #gpio-cells

Peter Meerwald (3):
  leds:pca963x: Add support for PCA9635 LED driver chip
  leds:pca963x: Always initialize MODE2 register
  leds:pca963x: Update for PCA9635 and correct statement about MODE2 OUTDRV default

Vincent Donnefort (1):
  leds: convert blink timer to workqueue

Wei Yongjun (1):
  leds: ipaq-micro: fix sparse non static symbol warning

 Documentation/ABI/testing/sysfs-class-leds-gt683r  |  16 +
 Documentation/devicetree/bindings/leds/pca963x.txt |   9 +-
 Documentation/devicetree/bindings/leds/tca6507.txt |   2 +-
 drivers/hid/Kconfig|  14 +
 drivers/hid/Makefile   |   1 +
 drivers/hid/hid-core.c |   1 +
 drivers/hid/hid-gt683r.c   | 321 +
 drivers/hid/hid-ids.h  |   2 +-
 drivers/hid/usbhid/hid-quirks.c|   2 +-
 drivers/input/keyboard/lm8323.c|  22 +-
 drivers/leds/Kconfig   |   7 +
 drivers/leds/Makefile  |   1 +
 drivers/leds/led-class.c   |  19 +-
 drivers/leds/led-core.c|  11 +-
 drivers/leds/leds-ipaq-micro.c | 141 +
 drivers/leds/leds-lm3530.c |  20 +-
 drivers/leds/leds-lm3533.c |  20 +-
 drivers/leds/leds-lm355x.c |  21 +-
 drivers/leds/leds-lm3642.c |  30 +-
 drivers/leds/leds-lp55xx-common.c  |  20 +-
 drivers/leds/leds-max8997.c|  16 +-
 drivers/leds/leds-netxbig.c|  31 +-
 drivers/leds/leds-ns2.c|  16 +-
 drivers/leds/leds-pca963x.c|  28 +-
 drivers/leds/leds-ss4200.c |  14 +-
 drivers/leds/leds-wm831x-status.c  |  23 +-
 include/linux/leds.h   |   5 +-
 27 files changed, 644 insertions(+), 169 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-class-leds-gt683r
 create mode 100644 drivers/hid/hid-gt683r.c
 create mode 100644 drivers/leds/leds-ipaq-micro.c


Re: [PATCH] usb: dwc2: Read GNPTXFSIZ when in forced HOST mode.

2014-08-07 Thread Doug Anderson
Paul,

On Thu, Aug 7, 2014 at 1:18 PM, Paul Zimmerman wrote:
>> From: Doug Anderson [mailto:diand...@chromium.org]
>> Sent: Thursday, August 07, 2014 12:48 PM
>>
>> The documentation for GNPTXFSIZ says that "For host mode, this field
>> is always valid."  Since we're already switching to host mode for
>> HPTXFSIZ, let's also read GNPTXFSIZ in host mode.
>>
>> On an rk3288 SoC, without this change we see this at bootup:
>>   dwc2 ff58.usb: gnptxfsiz=00100400
>>   dwc2 ff58.usb: 128 invalid for host_nperio_tx_fifo_size. Check HW configuration.
>>
>> After this change we see:
>>   dwc2 ff58.usb: gnptxfsiz=04000400
>>
>> Signed-off-by: Doug Anderson 
>> ---
>>  drivers/usb/dwc2/core.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
>> index 27d2c9b..c184ed43 100644
>> --- a/drivers/usb/dwc2/core.c
>> +++ b/drivers/usb/dwc2/core.c
>> @@ -2674,23 +2674,23 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg)
>>   hwcfg2 = readl(hsotg->regs + GHWCFG2);
>>   hwcfg3 = readl(hsotg->regs + GHWCFG3);
>>   hwcfg4 = readl(hsotg->regs + GHWCFG4);
>> - gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);
>>   grxfsiz = readl(hsotg->regs + GRXFSIZ);
>>
>>   dev_dbg(hsotg->dev, "hwcfg1=%08x\n", hwcfg1);
>>   dev_dbg(hsotg->dev, "hwcfg2=%08x\n", hwcfg2);
>>   dev_dbg(hsotg->dev, "hwcfg3=%08x\n", hwcfg3);
>>   dev_dbg(hsotg->dev, "hwcfg4=%08x\n", hwcfg4);
>> - dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
>>   dev_dbg(hsotg->dev, "grxfsiz=%08x\n", grxfsiz);
>>
>> - /* Force host mode to get HPTXFSIZ exact power on value */
>> + /* Force host mode to get HPTXFSIZ / GNPTXFSIZ exact power on value */
>>   gusbcfg = readl(hsotg->regs + GUSBCFG);
>>   gusbcfg |= GUSBCFG_FORCEHOSTMODE;
>>   writel(gusbcfg, hsotg->regs + GUSBCFG);
>>   usleep_range(10, 15);
>>
>> + gnptxfsiz = readl(hsotg->regs + GNPTXFSIZ);
>>   hptxfsiz = readl(hsotg->regs + HPTXFSIZ);
>> + dev_dbg(hsotg->dev, "gnptxfsiz=%08x\n", gnptxfsiz);
>>   dev_dbg(hsotg->dev, "hptxfsiz=%08x\n", hptxfsiz);
>>   gusbcfg = readl(hsotg->regs + GUSBCFG);
>>   gusbcfg &= ~GUSBCFG_FORCEHOSTMODE;
>
> Nice! I wonder if this is a bug in the original driver, and they
> actually meant to read this register instead of HPTXFSIZ? Well, it
> doesn't really matter I guess.
>
> Acked-by: Paul Zimmerman 
>
> You may want to resend this to Greg after -rc1 is out and he reopens
> his usb-next tree.

Thanks!

...do you think this is a fix that needs to go in for 3.17?  Is it
affecting anyone there?

-Doug
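
For anyone reading the two gnptxfsiz log lines quoted above, here is a small sketch of how the value decodes and why the parameter check complained. It assumes the FIFO depth sits in the upper 16 bits of the register and that the check rejects a requested size below 16 or above the hardware depth; both are my reading of the dwc2 driver, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: depth in the upper half, start address in the lower half. */
#define FIFOSIZE_DEPTH_SHIFT 16
#define FIFOSIZE_DEPTH_MASK  (0xffffu << FIFOSIZE_DEPTH_SHIFT)

static_assert(FIFOSIZE_DEPTH_MASK == 0xffff0000u, "depth is the upper half");

/* Extract the FIFO depth field from a FIFO size register value. */
unsigned fifo_depth(uint32_t reg)
{
	return (reg & FIFOSIZE_DEPTH_MASK) >> FIFOSIZE_DEPTH_SHIFT;
}

/* Sketch of the range check behind the "invalid for ..." message. */
int nperio_tx_fifo_size_valid(unsigned requested, uint32_t gnptxfsiz)
{
	unsigned hw_depth = fifo_depth(gnptxfsiz);

	return requested >= 16 && requested <= hw_depth;
}
```

Read outside forced host mode, 0x00100400 decodes to a depth of 16, so a requested size of 128 is rejected; read in host mode, 0x04000400 decodes to 1024 and the same request passes.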


[PATCH] sched: fix timeval conversion to jiffies

2014-08-07 Thread Andrew Hunter
timeval_to_jiffies rounding was broken.  It essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:

jiffies = (usec * x) >> USEC_JIFFIE_SC

and it rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value: rounding the division up by adding 2^USEC_JIFFIE_SC - 1, when
the error in scaling was added in, was sufficient to add one jiffie to
the final result.

In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC. This is obviously bad as a general rule, and caused
observable user problems with setitimer() at the very least:

setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);

would actually add a tick to val!

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.
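
The arithmetic above can be checked in userspace with the sketch below, which elides seconds just like the description. It assumes HZ=1000 with TICK_NSEC = 1000000, and scale shifts of 41 (usec) and 51 (nsec), which is what I believe the jiffies.h macros evaluate to for that HZ; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for HZ=1000. */
#define TICK_NSEC       1000000ULL
#define NSEC_PER_USEC   1000ULL
#define USEC_JIFFIE_SC  41
#define NSEC_JIFFIE_SC  51

/* Scaled multiplier: (num << shift) / TICK_NSEC, rounded up. */
static uint64_t scaled_ratio(uint64_t num, unsigned shift)
{
	return ((num << shift) + TICK_NSEC - 1) / TICK_NSEC;
}

/* Old scheme: round up in the scaled domain by adding 2^shift - 1. */
uint64_t old_usec_to_jiffies(uint64_t usec)
{
	uint64_t conv = scaled_ratio(NSEC_PER_USEC, USEC_JIFFIE_SC);
	uint64_t round = (1ULL << USEC_JIFFIE_SC) - 1;

	assert(usec < 1000000);  /* sub-second part only */
	return (usec * conv + round) >> USEC_JIFFIE_SC;
}

/* New scheme: convert to nsec, round up by TICK_NSEC - 1, then scale. */
uint64_t new_usec_to_jiffies(uint64_t usec)
{
	uint64_t conv = scaled_ratio(1, NSEC_JIFFIE_SC);
	uint64_t nsec = usec * NSEC_PER_USEC + TICK_NSEC - 1;

	assert(usec < 1000000);  /* sub-second part only */
	return (nsec * conv) >> NSEC_JIFFIE_SC;
}
```

Under these assumptions the old scheme turns 10000 usec into 11 jiffies because the slightly too large multiplier plus the full 2^41 - 1 rounding term spills over, while the nanosecond path gives 10.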

Tested: the following program:

#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 10000;
  setitimer(ITIMER_PROF, &initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i < 10; ++i) {
    struct itimerval prev;
    setitimer(ITIMER_PROF, &zero, &prev);
    /* on old kernels, this goes up by TICK_USEC every iteration */
    printf("previous value: %ld %ld %ld %ld\n",
           (long)prev.it_interval.tv_sec, (long)prev.it_interval.tv_usec,
           (long)prev.it_value.tv_sec, (long)prev.it_value.tv_usec);
    setitimer(ITIMER_PROF, &prev, NULL);
  }
  return 0;
}

Signed-off-by: Andrew Hunter 
Reviewed-by: Paul Turner 
Reported-by: Aaron Jacobs 
Change-Id: I7cd0f0764847fd055d39531f54e6ea3dd3ce5453
---
 include/linux/jiffies.h | 12 ---
 kernel/time.c   | 54 +++--
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 1f44466..c367cbd 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -258,23 +258,11 @@ extern unsigned long preset_lpj;
 #define SEC_JIFFIE_SC (32 - SHIFT_HZ)
 #endif
 #define NSEC_JIFFIE_SC (SEC_JIFFIE_SC + 29)
-#define USEC_JIFFIE_SC (SEC_JIFFIE_SC + 19)
 #define SEC_CONVERSION ((unsigned long)((((u64)NSEC_PER_SEC << SEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
 
 #define NSEC_CONVERSION ((unsigned long)((((u64)1 << NSEC_JIFFIE_SC) +\
 TICK_NSEC -1) / (u64)TICK_NSEC))
-#define USEC_CONVERSION  \
-((unsigned long)((((u64)NSEC_PER_USEC << USEC_JIFFIE_SC) +\
-TICK_NSEC -1) / (u64)TICK_NSEC))
-/*
- * USEC_ROUND is used in the timeval to jiffie conversion.  See there
- * for more details.  It is the scaled resolution rounding value.  Note
- * that it is a 64-bit value.  Since, when it is applied, we are already
- * in jiffies (albit scaled), it is nothing but the bits we will shift
- * off.
- */
-#define USEC_ROUND (u64)(((u64)1 << USEC_JIFFIE_SC) - 1)
 /*
  * The maximum jiffie value is (MAX_INT >> 1).  Here we translate that
  * into seconds.  The 64-bit case will overflow if we are not careful,
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..3c49ab4 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -496,17 +496,20 @@ EXPORT_SYMBOL(usecs_to_jiffies);
  * that a remainder subtract here would not do the right thing as the
  * resolution values don't fall on second boundries.  I.e. the line:
  * nsec -= nsec % TICK_NSEC; is NOT a correct resolution rounding.
+ * Note that due to the small error in the multiplier here, this
+ * rounding is incorrect for sufficiently large values of tv_nsec, but
+ * well formed timespecs should have tv_nsec < NSEC_PER_SEC, so we're
+ * OK.
  *
  * Rather, we just shift the bits off the right.
  *
  * The >> (NSEC_JIFFIE_SC - SEC_JIFFIE_SC) converts the scaled nsec
  * value to a scaled second value.
  */
-unsigned long
-timespec_to_jiffies(const struct timespec *value)
+static unsigned long
+__timespec_to_jiffies(unsigned long sec, long nsec)
 {
-   unsigned long sec = value->tv_sec;
-   long nsec = value->tv_nsec + TICK_NSEC - 1;
+   nsec = nsec + TICK_NSEC - 1;
 
if (sec >= MAX_SEC_IN_JIFFIES){
sec = MAX_SEC_IN_JIFFIES;
@@ -517,6 +520,13 @@ timespec_to_jiffies(const struct timespec *value)
 (NSEC_JIFFIE_SC - SEC_JIFFIE_SC))) >> SEC_JIFFIE_SC;
 
 }
+
+unsigned long

[PATCH] perf, x86: Fix :pp without LBR

2014-08-07 Thread Andi Kleen
From: Andi Kleen 

This fixes a side effect of Kan's earlier patch to probe the LBRs at boot
time. Normally when the LBRs are disabled cycles:pp is disabled too.
So for example cycles:pp doesn't work.

However this is not needed with PEBSv2 and later (Haswell) because
it does not need LBRs to correct the IP-off-by-one.

So add an extra check for PEBSv2 that also allows :pp

Signed-off-by: Andi Kleen 
---
 arch/x86/kernel/cpu/perf_event.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 2879ecd..0646d3b 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -387,7 +387,7 @@ int x86_pmu_hw_config(struct perf_event *event)
precise++;
 
/* Support for IP fixup */
-   if (x86_pmu.lbr_nr)
+   if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
precise++;
}
 
-- 
1.9.3
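
A standalone sketch of the precise-level decision after this change is shown below. The struct is a hypothetical, trimmed-down stand-in for the x86_pmu fields the hunk consults, and the function names are invented for the example; the nesting of the IP-fixup check inside the PEBS check follows the context lines of the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the PMU capabilities used by the check. */
struct pmu_caps {
	bool pebs_active;
	int  lbr_nr;       /* number of LBR entries, 0 when LBRs are unavailable */
	int  pebs_format;  /* PEBS record format version */
};

/* Highest precise_ip level the PMU can honour. */
int max_precise_level(const struct pmu_caps *pmu)
{
	int precise = 0;

	assert(pmu);
	if (pmu->pebs_active) {
		precise++;  /* constant skid */

		/* IP fixup: via LBRs, or built in from PEBS format 2 on. */
		if (pmu->lbr_nr || pmu->pebs_format >= 2)
			precise++;
	}
	return precise;
}

/* An event asking for more precision than available is refused. */
bool precise_ip_supported(const struct pmu_caps *pmu, int precise_ip)
{
	return precise_ip <= max_precise_level(pmu);
}
```

With LBRs unavailable, a PMU reporting PEBS format 2 still reaches level 2 (so :pp is accepted), whereas one reporting format 1 stops at level 1.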



Re: [PATCH 1/3] libsas: modify SATA error handler

2014-08-07 Thread Dan Williams
[ adding yuxia...@marvell.com ]

On Tue, Jun 3, 2014 at 6:41 PM, Dan Williams  wrote:
> Hi, some notes below:
>
> On Thu, Apr 24, 2014 at 6:27 AM, Xiangliang Yu  wrote:
>> Add support for SATA port softreset and port multiplier error
>> handling.
>
> Some more detailed notes about the approach and any caveats would be
> appreciated.
>
>>
>> Signed-off-by: Xiangliang Yu 
>> ---
>>  drivers/scsi/libsas/sas_ata.c |  226 
>> -
>>  include/scsi/libsas.h |6 +
>>  2 files changed, 231 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
>> index 766098a..29a19fd 100644
>> --- a/drivers/scsi/libsas/sas_ata.c
>> +++ b/drivers/scsi/libsas/sas_ata.c
>> @@ -442,6 +442,226 @@ static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
>> return ret;
>>  }
>>
>> +static void sas_ata_freeze(struct ata_port *ap)
>> +{
>> +   struct domain_device *dev = ap->private_data;
>> +   struct sas_ha_struct *sas_ha = dev->port->ha;
>> +   struct Scsi_Host *host = sas_ha->core.shost;
>> +   struct sas_internal *i = to_sas_internal(host->transportt);
>> +
>> +   if (i->dft->lldd_dev_freeze)
>> +   i->dft->lldd_dev_freeze(dev);
>> +}
>> +
>> +static void sas_ata_thaw(struct ata_port *ap)
>> +{
>> +   struct domain_device *dev = ap->private_data;
>> +   struct sas_ha_struct *sas_ha = dev->port->ha;
>> +   struct Scsi_Host *host = sas_ha->core.shost;
>> +   struct sas_internal *i = to_sas_internal(host->transportt);
>> +
>> +   if (i->dft->lldd_dev_thaw)
>> +   i->dft->lldd_dev_thaw(dev);
>> +}
>> +
>> +static int sas_ata_wait_task_done(struct sas_task *task, unsigned long timeout,
>> +   int (*check_done)(struct sas_task *task))
>> +{
>
> Why do we need a custom check_done() routine?  Since we have a
> sas_task we can use the normal completion infrastructure.  See
> smp_execute_task().
>
>> +   struct ata_port *ap = task->uldd_task;
>> +   unsigned long deadline;
>> +   int done;
>> +
>> +   if (!check_done) {
>> +   SAS_DPRINTK("check function is null.\n");
>> +   return -1;
>> +   }
>> +
>> +   deadline = ata_deadline(jiffies, timeout);
>> +   done = check_done(task);
>> +
>> +   while (done && time_before(jiffies, deadline)) {
>> +   ata_msleep(ap, 1);
>> +
>> +   done = check_done(task);
>
> This can simply be:
>
> completion_done(&task->slow_task->completion)
>
>> +   }
>> +
>> +   return done;
>> +}
>> +
>> +static int sas_ata_exec_polled_cmd(struct ata_port *ap, struct ata_taskfile *tf,
>> +   int pmp, unsigned long timeout)
>> +{
>> +   struct domain_device *dev = ap->private_data;
>> +   struct sas_ha_struct *sas_ha = dev->port->ha;
>> +   struct Scsi_Host *host = sas_ha->core.shost;
>> +   struct sas_internal *i = to_sas_internal(host->transportt);
>> +   struct sas_task *task = NULL;
>> +   int ret = -1;
>> +
>> +   if (!i->dft->lldd_execute_task) {
>> +   SAS_DPRINTK("execute function is null.\n");
>> +   return ret;
>> +   }
>> +
>> +   task = sas_alloc_task(GFP_ATOMIC);
>
> I think this can be downgraded to GFP_NOIO.  We're in a sleepable context.
>
>> +   if (!task) {
>> +   SAS_DPRINTK("failed to alloc sas task.\n");
>> +   goto fail;
>> +   }
>> +
>> +   task->dev = dev;
>> +   task->task_proto = SAS_PROTOCOL_SATA;
>> +   task->uldd_task = ap;
>> +
>> +   ata_tf_to_fis(tf, pmp, 0, (u8 *)&task->ata_task.fis);
>> +   task->ata_task.retry_count = 1;
>> +   task->task_state_flags = SAS_TASK_STATE_PENDING;
>> +   task->task_state_flags |= SAS_TASK_NEED_DEV_RESET;
>> +
>> +   ret = i->dft->lldd_execute_task(task, 1, GFP_ATOMIC);
>
> Same here.
>
>> +   if (ret) {
>> +   SAS_DPRINTK("failed to send internal task.\n");
>> +   goto fail;
>> +   }
>> +
>> +   if (timeout) {
>> +   ret = sas_ata_wait_task_done(task, timeout,
>> +   i->dft->lldd_wait_task_done);
>> +   if (ret) {
>> +   SAS_DPRINTK("get wrong status.\n");
>> +   goto fail;
>> +   }
>> +   }
>> +   list_del_init(&task->list);
>> +   sas_free_task(task);
>> +
>> +   return 0;
>> +fail:
>> +   if (task) {
>> +   list_del_init(&task->list);
>> +   sas_free_task(task);
>> +   }
>> +
>> +   return ret;
>> +}
>> +
>> +static int sas_ata_soft_reset(struct ata_link *link, unsigned int *class,
>> + unsigned long deadline)
>> +{
>> +   struct ata_taskfile tf;
>> +   struct ata_port *ap = link->ap;
>> +   struct domain_device *dev = ap->private_data;
>> +   struct sas_ha_struct *sas_ha = dev->port->ha;
>> +   struct Scsi_Host 

Re: [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin

2014-08-07 Thread Davidlohr Bueso
On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> v1->v2:
>  - Remove patch 1 which changes preempt_enable() to
>preempt_enable_no_resched().
>  - Remove the RWSEM_READ_OWNED macro and assume readers own the lock
>when owner is NULL.
>  - Reduce the spin threshold to 64.

So I still don't like this, and the fact that it is used in some
virtualization locking bits doesn't really address the concerns about
arbitrary logic in our general locking code.

Also, why did you reduce it from 100 to 64? This very much wants to be
commented.

Thanks,
Davidlohr



Re: linux-next: build failure after merge of the modules tree

2014-08-07 Thread Rusty Russell
Stephen Rothwell  writes:
> Hi Rusty,
>
> On Thu, 07 Aug 2014 21:07:20 +0930 Rusty Russell wrote:
>>
>> Ah, crap.  I really hate that macro magic :(
>> 
>> I amended to the minimal fix, so we can pretend that never happened.
>
> How about arch/powerpc/platforms/powernv/opal-dump.c as well?

Hmm, I didn't see that one.

I've done that as a separate patch.

Thanks,
Rusty.

