date:20150517

Re: [PATCH v7 2/3] I2C: mediatek: Add driver for MediaTek I2C controller

2015-05-17 Thread Eddie Huang

Hi Wolfram,

I have narrowed down the CC list.
Please see my reply below.

On Tue, 2015-05-12 at 14:58 +0200, w...@the-dreams.de wrote:
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> 
> Please sort the includes to avoid duplicates.
OK, will fix it.

> 
> > +struct mtk_i2c_compatible {
> > +   const struct i2c_adapter_quirks *quirks;
> > +   unsigned char pmic_i2c;
> > +   unsigned char dcm;
> > +};
> 
> I wonder if the unsigned char options can be bits in a flags variable?

OK, will fix it.
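
For illustration, the two unsigned char options could be folded into a single
flags word roughly as below. This is only a sketch: the demo_* names and the
bit assignments are made up for this example and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit assignments, one bit per former unsigned char option. */
#define DEMO_I2C_FLAG_PMIC	(1U << 0)
#define DEMO_I2C_FLAG_DCM	(1U << 1)

/* Stand-in for the per-SoC data; quirks pointer omitted for brevity. */
struct demo_i2c_compatible {
	unsigned int flags;
};

/* Returns true if the given option bit is set for this SoC variant. */
static bool demo_has_flag(const struct demo_i2c_compatible *compat,
			  unsigned int flag)
{
	assert(compat != NULL);
	return (compat->flags & flag) != 0;
}
```

A new option then costs one more #define rather than another struct member.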

> 
> > +static const struct i2c_adapter_quirks mt6577_i2c_quirks = {
> > +   .flags = I2C_AQ_COMB_WRITE_THEN_READ,
> > +   .max_num_msgs = MAX_MSG_NUM_MT6577,
> > +   .max_write_len = MAX_DMA_TRANS_SIZE_MT6577,
> > +   .max_read_len = MAX_DMA_TRANS_SIZE_MT6577,
> > +   .max_comb_1st_msg_len = MAX_DMA_TRANS_SIZE_MT6577,
> > +   .max_comb_2nd_msg_len = MAX_WRRD_TRANS_SIZE_MT6577,
> > +};
> 
> I would think plain numbers are much more readable than defines here.
> They are used only once, too.
> 
OK, will fix it.

> > +static const struct of_device_id mtk_i2c_of_match[] = {
> > +   { .compatible = "mediatek,mt6577-i2c", .data = (void *)&mt6577_compat },
> > +   { .compatible = "mediatek,mt6589-i2c", .data = (void *)&mt6589_compat },
> > +   {}
> > +};
> > +MODULE_DEVICE_TABLE(of, mtk_i2c_of_match);
> 
> No need for casts.
> 
OK, will fix it.

> > +static inline void mtk_i2c_writew(u16 value, struct mtk_i2c *i2c, u8 
> > offset)
> > +{
> > +   writew(value, i2c->base + offset);
> > +}
> > +
> > +static inline u16 mtk_i2c_readw(struct mtk_i2c *i2c, u8 offset)
> > +{
> > +   return readw(i2c->base + offset);
> > +}
> 
> I am not a big fan of such extremely thin wrappers, but if you like them...
> 
OK, will fix it.

> > +   rpaddr = dma_map_single(i2c->adap.dev.parent, msgs->buf,
> > +   msgs->len, DMA_FROM_DEVICE);
> 
> I think you shouldn't use the adapter device here and later, but the dma
> channel device.
> 
In MTK SoCs, each I2C controller has its own DMA engine, and this DMA engine
can't be used by other hardware.
So I tend to use the DMA directly rather than through a DMA channel.
Even so, "i2c->adap.dev.parent" is not suitable here. I will change it to
i2c->dev (see i2c-at91.c for reference).

> > +   /* flush before sending start */
> > +   mb();
> > +   mtk_i2c_writel_dma(I2C_DMA_START_EN, i2c, OFFSET_EN);
> 
> Is mb() really needed when you use writel after that?
> 
mb() should be removed here.

> > +   if (i2c->irq_stat & (I2C_HS_NACKERR | I2C_ACKERR)) {
> > +   dev_dbg(i2c->dev, "addr: %x, transfer ACK error\n", msgs->addr);
> > +   mtk_i2c_init_hw(i2c);
> > +   return -EREMOTEIO;
> 
> -ENXIO. Please check Documentation/i2c/fault-codes for a reference and
> check all your error values.
> 
OK, I will check return values.

> > +   ret = of_property_read_u32(np, "clock-div", clk_src_div);
> > +   if (ret < 0)
> > +   return ret;
> 
> Do we need a property for that in the i2c-block? Can't the clock
> provider driver deliver a properly divided clock already?
> 
Actually, the clock divider is implemented in the I2C controller itself.
The clock provider has no knowledge of this divider. The divider may
differ between chips, which is why we put "clock-div" in the device
tree.

> 
> And cppcheck says:
> 
> drivers/i2c/busses/i2c-mt65xx.c:133: style: struct or union member 
> 'mtk_i2c_data::clk_frequency' is never used.
> drivers/i2c/busses/i2c-mt65xx.c:135: style: struct or union member 
> 'mtk_i2c_data::clk_src_div' is never used.
> 
OK, I will remove them.

Thanks for your review.
Eddie




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/2] f2fs: reserve space for tmpfile

2015-05-17 Thread Chao Yu

Add the missing f2fs_balance_fs() call to reserve space for ->tmpfile.

Signed-off-by: Chao Yu 
---
 fs/f2fs/namei.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index bed0cb0..38a783e 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -517,6 +517,9 @@ static int __f2fs_tmpfile(struct inode *dir, struct dentry 
*dentry,
struct inode *inode;
int err;
 
+   if (!whiteout)
+   f2fs_balance_fs(sbi);
+
inode = f2fs_new_inode(dir, mode);
if (IS_ERR(inode))
return PTR_ERR(inode);
-- 
2.3.3



[PATCH 1/2] f2fs: support RENAME_WHITEOUT

2015-05-17 Thread Chao Yu

As described in the rename manual page, RENAME_WHITEOUT is a special operation
that only makes sense for overlay/union type filesystems.

When performing a rename with RENAME_WHITEOUT, dst is replaced with src, and
at the same time a 'whiteout' is created with the name of src.

A "whiteout" is defined as a char device with device number 0,0; it has a
special meaning for stackable filesystems. In these filesystems there are
multiple layers, and only the top layer can be modified. So a whiteout in the
top layer is used to hide the corresponding file in a lower layer, and
removing the whiteout makes the file appear again.

Now in overlayfs, when we rename a file which exists in a lower layer, it is
copied up to the upper layer if it is not there yet and then renamed on the
upper layer; at the same time, a whiteout is created for the source file to
hide the corresponding file in the lower layer.

So in the upper layer filesystem, an implementation of RENAME_WHITEOUT provides
an atomic operation for stackable filesystems to support rename.

There are multiple ways to implement RENAME_WHITEOUT; they are described in
the log of this commit:
7dcf5c3e4527 ("xfs: add RENAME_WHITEOUT support"), which was pointed out by
Dave Chinner.

For now, we just follow the approach that xfs/ext4 use.

Signed-off-by: Chao Yu 
---
 fs/f2fs/namei.c | 140 
 1 file changed, 90 insertions(+), 50 deletions(-)

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 16b74da..bed0cb0 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -510,14 +510,80 @@ out:
return err;
 }
 
+static int __f2fs_tmpfile(struct inode *dir, struct dentry *dentry,
+   umode_t mode, struct inode **whiteout)
+{
+   struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
+   struct inode *inode;
+   int err;
+
+   inode = f2fs_new_inode(dir, mode);
+   if (IS_ERR(inode))
+   return PTR_ERR(inode);
+
+   if (whiteout) {
+   init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
+   inode->i_op = &f2fs_special_inode_operations;
+   } else {
+   inode->i_op = &f2fs_file_inode_operations;
+   inode->i_fop = &f2fs_file_operations;
+   inode->i_mapping->a_ops = &f2fs_dblock_aops;
+   }
+
+   f2fs_lock_op(sbi);
+   err = acquire_orphan_inode(sbi);
+   if (err)
+   goto out;
+
+   err = f2fs_do_tmpfile(inode, dir);
+   if (err)
+   goto release_out;
+
+   /*
+* add this non-linked tmpfile to orphan list, in this way we could
+* remove all unused data of tmpfile after abnormal power-off.
+*/
+   add_orphan_inode(sbi, inode->i_ino);
+   f2fs_unlock_op(sbi);
+
+   alloc_nid_done(sbi, inode->i_ino);
+
+   if (whiteout) {
+   inode_dec_link_count(inode);
+   *whiteout = inode;
+   } else {
+   d_tmpfile(dentry, inode);
+   }
+   unlock_new_inode(inode);
+   return 0;
+
+release_out:
+   release_orphan_inode(sbi);
+out:
+   handle_failed_inode(inode);
+   return err;
+}
+
+static int f2fs_tmpfile(struct inode *dir, struct dentry *dentry, umode_t mode)
+{
+   return __f2fs_tmpfile(dir, dentry, mode, NULL);
+}
+
+static int f2fs_create_whiteout(struct inode *dir, struct inode **whiteout)
+{
+   return __f2fs_tmpfile(dir, NULL, S_IFCHR | WHITEOUT_MODE, whiteout);
+}
+
 static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
-   struct inode *new_dir, struct dentry *new_dentry)
+   struct inode *new_dir, struct dentry *new_dentry,
+   unsigned int flags)
 {
struct f2fs_sb_info *sbi = F2FS_I_SB(old_dir);
struct inode *old_inode = d_inode(old_dentry);
struct inode *new_inode = d_inode(new_dentry);
+   struct inode *whiteout = NULL;
struct page *old_dir_page;
-   struct page *old_page, *new_page;
+   struct page *old_page, *new_page = NULL;
struct f2fs_dir_entry *old_dir_entry = NULL;
struct f2fs_dir_entry *old_entry;
struct f2fs_dir_entry *new_entry;
@@ -543,6 +609,12 @@ static int f2fs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
goto out_old;
}
 
+   if (flags & RENAME_WHITEOUT) {
+   err = f2fs_create_whiteout(old_dir, &whiteout);
+   if (err)
+   goto out_dir;
+   }
+
if (new_inode) {
 
err = -ENOTEMPTY;
@@ -611,8 +683,17 @@ static int f2fs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
 
f2fs_delete_entry(old_entry, old_page, old_dir, NULL);
 
+   if (whiteout) {
+   whiteout->i_state |= I_LINKABLE;
+   set_inode_flag(F2FS_I(whiteout), FI_INC_LINK);
+   err = f2fs_add_link(old_dentry, whiteout);
+   if (err)
+   goto put_out_dir;
+

Re: [PATCH v2] clk: mediatek: Initialize clk_init_data

2015-05-17 Thread Sascha Hauer

Hi Ricky,

On Mon, May 18, 2015 at 11:41:49AM +0800, Ricky Liang wrote:
> The variable init (struct clk_init_data) is allocated on the stack.
> We weren't initializing the .flags field, so it contains random junk,
> which can cause all kinds of interesting issues when the flags are
> parsed by clk_register.

It seems we have the same problem in clk-gate.c as well. We do initialize
the .flags field there, so this is not an urgent problem, but we might get a
real problem when additional fields are added to struct clk_init_data.
Care to fix that as well along with this patch?

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
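
To illustrate the point about fields added later: if the on-stack struct is
zero-initialized as a whole, members that are never assigned still start out
as 0. The sketch below uses a hypothetical stand-in struct (demo_init_data),
not the real struct clk_init_data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an init struct that may grow new members. */
struct demo_init_data {
	const char *name;
	unsigned long flags;
	unsigned int num_parents;
};

/*
 * Builds the struct on the stack. The designated initializer names only
 * .name; C guarantees that all members not named are zero-initialized,
 * so .flags and .num_parents do not contain stack garbage.
 */
static struct demo_init_data demo_make_init(const char *name)
{
	struct demo_init_data init = { .name = name };

	assert(name != NULL);
	return init;
}
```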

Re: [PATCH 05/14] ceph: Use kvfree() in ceph_put_page_vector()

2015-05-17 Thread Yan, Zheng


> On May 16, 2015, at 02:35, Pekka Enberg  wrote:
> 
> Use kvfree instead of open-coding it.
> 
> Cc: "Yan, Zheng" 
> Cc: Sage Weil 
> Signed-off-by: Pekka Enberg 
> ---
> net/ceph/pagevec.c | 5 +
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> index 096d914..d4f5f22 100644
> --- a/net/ceph/pagevec.c
> +++ b/net/ceph/pagevec.c
> @@ -51,10 +51,7 @@ void ceph_put_page_vector(struct page **pages, int 
> num_pages, bool dirty)
>   set_page_dirty_lock(pages[i]);
>   put_page(pages[i]);
>   }
> - if (is_vmalloc_addr(pages))
> - vfree(pages);
> - else
> - kfree(pages);
> + kvfree(pages);
> }
> EXPORT_SYMBOL(ceph_put_page_vector);

Thanks, but Ilya Dryomov has already submitted a similar patch.

> 
> -- 
> 2.1.0
> 


[RFC PATCH 4/5] samples/bpf: Add proper prefix to objects in Makefile

2015-05-17 Thread He Kuang

Always use $(obj) when referring to generated files and use $(src) when
referring to files located in the src tree.

Signed-off-by: He Kuang 
---
 samples/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 76e3458..8fdbd73 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -44,7 +44,7 @@ HOSTLOADLIBES_tracex4 += -lelf -lrt
 # point this to your LLVM backend with bpf support
 LLC=$(srctree)/tools/bpf/llvm/bld/Debug+Asserts/bin/llc
 
-%.o: %.c
+$(obj)/%.o: $(src)/%.c
clang $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
-D__KERNEL__ -Wno-unused-value -Wno-pointer-sign \
-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
-- 
1.8.5.2


Re: [PATCH v6 1/5] random: Blocking API for accessing nonblocking_pool

2015-05-17 Thread Stephan Mueller

Am Freitag, 15. Mai 2015, 14:46:26 schrieb Herbert Xu:

Hi Herbert,

> On Wed, May 13, 2015 at 09:54:41PM +0200, Stephan Mueller wrote:
> >  /*
> > 
> > + * Equivalent function to get_random_bytes with the difference that this
> > + * function blocks the request until the nonblocking_pool is initialized.
> > + */
> > +void get_blocking_random_bytes(void *buf, int nbytes)
> > +{
> > +   if (unlikely(nonblocking_pool.initialized == 0))
> > +   wait_event_interruptible(urandom_init_wait,
> > +nonblocking_pool.initialized);
> > +   extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
> 
> So what if the wait was interrupted? You are going to extract
> entropy from an empty pool.
> 
> Anyway, you still haven't addressed my primary concern with this
> model which is the potential for dead-lock.  Sleeping for an open
> period of time like this in a work queue is bad form.  It may also
> lead to dead-locks if whatever you're waiting for happened to use
> the same work thread.
> 
> That's why I think you should simply provide a function and data
> pointer which random.c can then stash onto a list to call when
> the pool is ready.

Thanks for the hint about the list. Before handing in another formal patch, may
I ask you to check the following approach? I think it should cover your
concerns.

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 9cd6968..9bc2a57 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -409,6 +409,19 @@ static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
 static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait);
 static struct fasync_struct *fasync;
 
+static LIST_HEAD(random_wait_list);
+static DEFINE_MUTEX(random_wait_list_mutex);
+struct random_work {
+   struct list_head    list;
+   struct work_struct  rw_work;
+   void                *rw_buf;
+   int                 rw_len;
+   void                *rw_private;
+   void                (*rw_cb)(void *buf, int buflen,
+                                void *private);
+};
+static void process_random_waiters(void);
+
 /**
  *
  * OS independent entropy store.   Here are the functions which handle
@@ -660,6 +673,7 @@ retry:
r->entropy_total = 0;
if (r == &nonblocking_pool) {
prandom_reseed_late();
+   process_random_waiters();
wake_up_interruptible(&urandom_init_wait);
pr_notice("random: %s pool is initialized\n", r->name);
}
@@ -1778,3 +1792,64 @@ void add_hwgenerator_randomness(const char *buffer, 
size_t count,
credit_entropy_bits(poolp, entropy);
 }
 EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+static void process_random_waiters(void)
+{
+   struct random_work *rw = NULL;
+
+   mutex_lock(&random_wait_list_mutex);
+   while (!list_empty(&random_wait_list)) {
+   rw = list_first_entry(&random_wait_list, struct random_work,
+ list);
+   list_del(&rw->list);
+   schedule_work(&rw->rw_work);
+   }
+   mutex_unlock(&random_wait_list_mutex);
+}
+
+static void get_blocking_random_bytes_work(struct work_struct *work)
+{
+   struct random_work *rw = container_of(work, struct random_work,
+ rw_work);
+
+   get_random_bytes(rw->rw_buf, rw->rw_len);
+   rw->rw_cb(rw->rw_buf, rw->rw_len, rw->rw_private);
+   kfree(rw);
+}
+
+/*
+ * Equivalent function to get_random_bytes with the difference that this
+ * function blocks the request until the nonblocking_pool is initialized.
+ */
+int get_blocking_random_bytes_cb(void *buf, int nbytes, void *private,
+void (*cb)(void *buf, int buflen,
+   void *private))
+{
+   struct random_work *rw = NULL;
+   int ret = 0;
+
+   mutex_lock(&random_wait_list_mutex);
+   list_for_each_entry(rw, &random_wait_list, list)
+   if (buf == rw->rw_buf)
+   goto out;
+
+   rw = kmalloc(sizeof(struct random_work), GFP_KERNEL);
+   if (!rw) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   INIT_WORK(&rw->rw_work, get_blocking_random_bytes_work);
+   rw->rw_buf = buf;
+   rw->rw_len = nbytes;
+   rw->rw_private = private;
+   rw->rw_cb = cb;
+   list_add_tail(&rw->list, &random_wait_list);
+
+out:
+   mutex_unlock(&random_wait_list_mutex);
+   if (nonblocking_pool.initialized)
+   process_random_waiters();
+
+   return ret;
+}
+EXPORT_SYMBOL(get_blocking_random_bytes_cb);
diff --git a/include/linux/random.h b/include/linux/random.h
index b05856e..b57525f 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -15,6 +15,9 @@ extern void add_interrupt_randomness(int irq, int irq_flags);
 
 extern void get_random_bytes(void
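
The idea discussed in this thread (hand over a function and a data pointer,
which are stashed on a list and called once the pool is ready) can be sketched
in plain userspace C as below. This is not the kernel code: all demo_* names
are hypothetical, there is no locking, and the callbacks run directly instead
of through a work queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef void (*demo_ready_cb)(void *data);

/* One pending request: callback plus the data pointer to pass to it. */
struct demo_waiter {
	struct demo_waiter *next;
	demo_ready_cb cb;
	void *data;
};

static struct demo_waiter *demo_head;			/* first pending waiter */
static struct demo_waiter **demo_tail = &demo_head;	/* where to append */
static bool demo_pool_ready;

/* Calls cb now if the pool is ready, otherwise queues it in FIFO order. */
static int demo_call_when_ready(demo_ready_cb cb, void *data)
{
	struct demo_waiter *w;

	assert(cb != NULL);
	if (demo_pool_ready) {
		cb(data);
		return 0;
	}
	w = malloc(sizeof(*w));
	if (!w)
		return -1;
	w->next = NULL;
	w->cb = cb;
	w->data = data;
	*demo_tail = w;
	demo_tail = &w->next;
	return 0;
}

/* Marks the pool ready and runs every queued callback exactly once. */
static void demo_mark_ready(void)
{
	struct demo_waiter *w = demo_head;

	demo_pool_ready = true;
	demo_head = NULL;
	demo_tail = &demo_head;
	while (w) {
		struct demo_waiter *next = w->next;

		w->cb(w->data);
		free(w);
		w = next;
	}
}

/* Example callback: counts how often it has been invoked. */
static void demo_count_cb(void *data)
{
	++*(int *)data;
}
```

The caller never sleeps: it either gets its callback immediately or returns
right away and is called back later.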

[RFC PATCH 5/5] samples/bpf: Add sample for testing bpf fetch args

2015-05-17 Thread He Kuang

Sample code for testing bpf fetch args.

It works as follows:

  $ perf bpf record --object sample_bpf_fetch_args.o -- dd if=/dev/zero 
of=/mnt/data/test bs=4k count=3

Show the result in the ring buffer:
  $ perf script
  dd  1088 [000]  5740.260451: perf_bpf_probe:generic_perform_write: 
(811308ea) a_ops=0x81a20160 bytes=0x1000 
page=0x88007c621540 pos=0
  dd  1088 [000]  5740.260451: perf_bpf_probe:generic_perform_write: 
(811308ea) a_ops=0x81a20160 bytes=0x1000 
page=0xea0001c49f40 pos=4096
  dd  1088 [000]  5740.260451: perf_bpf_probe:generic_perform_write: 
(811308ea) a_ops=0x81a20160 bytes=0x1000 
page=0xea0001c49f80 pos=8192

Show the result printed by the bpf prog:
  $ cat /sys/kernel/debug/tracing/trace |grep dd
  dd-1098  [000] d...  6892.829003: : NODE_write1 a_ops=81a20160, 
bytes=1000
  dd-1098  [000] d...  6892.829049: : NODE_write2 page =88007c621540, pos  
=  (null)
  dd-1098  [000] d...  6892.829650: : NODE_write1 a_ops=81a20160, 
bytes=1000
  dd-1098  [000] d...  6892.829662: : NODE_write2 page =ea0001c49f40, pos  
=1000
  dd-1098  [000] d...  6892.829831: : NODE_write1 a_ops=81a20160, 
bytes=1000
  dd-1098  [000] d...  6892.829842: : NODE_write2 page =ea0001c49f80, pos  
=2000

Signed-off-by: He Kuang 
---
 samples/bpf/Makefile|  1 +
 samples/bpf/sample_bpf_fetch_args.c | 43 +
 2 files changed, 44 insertions(+)
 create mode 100644 samples/bpf/sample_bpf_fetch_args.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 8fdbd73..dc0b0e8 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -30,6 +30,7 @@ always += tracex2_kern.o
 always += tracex3_kern.o
 always += tracex4_kern.o
 always += tcbpf1_kern.o
+always += sample_bpf_fetch_args.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 
diff --git a/samples/bpf/sample_bpf_fetch_args.c 
b/samples/bpf/sample_bpf_fetch_args.c
new file mode 100644
index 000..9b587df
--- /dev/null
+++ b/samples/bpf/sample_bpf_fetch_args.c
@@ -0,0 +1,43 @@
+/*
+  Sample code for bpf_fetch_args().
+*/
+
+#include 
+#include 
+
+#include 
+#include 
+#include "bpf_helpers.h"
+
+SEC("generic_perform_write=generic_perform_write+122 file->f_mapping->a_ops 
bytes page pos")
+int NODE_generic_perform_write(struct pt_regs *ctx)
+{
+   struct param_s {
+   unsigned long a_ops;
+   unsigned long bytes;
+   unsigned long page;
+   unsigned long pos;
+   } param = {0};
+
+   bpf_fetch_args(ctx, &param);
+
+   /* actions */
+   {
+   /* 5 args max for bpf_trace_printk, print in 2 lines */
+   char fmt1[] = "NODE_write1 a_ops=%p, bytes=%p\n";
+   char fmt2[] = "NODE_write2 page =%p, pos  =%p\n";
+
+   bpf_trace_printk(fmt1, sizeof(fmt1),
+   param.a_ops,
+   param.bytes);
+
+   bpf_trace_printk(fmt2, sizeof(fmt2),
+   param.page,
+   param.pos);
+   }
+
+   return 1;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
-- 
1.8.5.2


[RFC PATCH 3/5] bpf: Add helper function for fetching variables at probe point

2015-05-17 Thread He Kuang

This helper function uses the kernel structure trace_probe and the related
fetch functions to fetch the variables described in 'SEC' onto the bpf stack.

Signed-off-by: He Kuang 
---
 include/uapi/linux/bpf.h  |  1 +
 kernel/trace/bpf_trace.c  | 38 ++
 samples/bpf/bpf_helpers.h |  2 ++
 3 files changed, 41 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a9ebdf5..b1a7685 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -210,6 +210,7 @@ enum bpf_func_id {
 * Return: 0 on success
 */
BPF_FUNC_l4_csum_replace,
+   BPF_FUNC_fetch_args,
__BPF_FUNC_MAX_ID,
 };
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 2d56ce5..ba601da 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include "trace.h"
+#include "trace_probe.h"
 
 static DEFINE_PER_CPU(int, bpf_prog_active);
 
@@ -159,6 +160,39 @@ static const struct bpf_func_proto bpf_trace_printk_proto 
= {
.arg2_type  = ARG_CONST_STACK_SIZE,
 };
 
+/* Store the value of each argument */
+static void
+bpf_store_trace_args(struct pt_regs *regs, struct trace_probe *tp,
+   u8 *data)
+{
+   int i;
+
+   for (i = 0; i < tp->nr_args; i++) {
+   /* Just fetching data normally */
+   call_fetch(&tp->args[i].fetch, regs,
+   data + tp->args[i].offset);
+   }
+}
+
+static u64 bpf_fetch_args(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+   struct pt_regs *regs = (struct pt_regs *)(long)r1;
+   struct trace_probe *tp = ((struct bpf_pt_regs *)regs)->tp;
+   void *data = (void *)(long)r2;
+
+   bpf_store_trace_args(regs, tp, data);
+
+   return 0;
+}
+
+static const struct bpf_func_proto bpf_fetch_args_proto = {
+   .func   = bpf_fetch_args,
+   .gpl_only   = true,
+   .ret_type   = RET_INTEGER,
+   .arg1_type  = ARG_PTR_TO_CTX,
+   .arg2_type  = ARG_PTR_TO_STACK,
+};
+
 static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id 
func_id)
 {
switch (func_id) {
@@ -181,6 +215,10 @@ static const struct bpf_func_proto 
*kprobe_prog_func_proto(enum bpf_func_id func
trace_printk_init_buffers();
 
return &bpf_trace_printk_proto;
+
+   case BPF_FUNC_fetch_args:
+   return &bpf_fetch_args_proto;
+
default:
return NULL;
}
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index f960b5f..578a8e3 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -21,6 +21,8 @@ static unsigned long long (*bpf_ktime_get_ns)(void) =
(void *) BPF_FUNC_ktime_get_ns;
 static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
(void *) BPF_FUNC_trace_printk;
+static int (*bpf_fetch_args)(void *ctx, void *data) =
+   (void *) BPF_FUNC_fetch_args;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
-- 
1.8.5.2


[RFC PATCH 2/5] bpf: Pass trace_probe to bpf_prog for variable fetching

2015-05-17 Thread He Kuang

Add a new structure, bpf_pt_regs, which contains both the original
'ctx' (pt_regs) and a trace_probe pointer, and pass this new pointer to the
bpf prog for variable fetching.

Signed-off-by: He Kuang 
---
 kernel/trace/trace_kprobe.c | 11 +--
 kernel/trace/trace_probe.h  |  5 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d0ce590..cee0b28 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1141,8 +1141,15 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs 
*regs)
int size, __size, dsize;
int rctx;
 
-   if (prog && !trace_call_bpf(prog, regs))
-   return;
+   if (prog) {
+   struct bpf_pt_regs bpf_pt_regs;
+
+   bpf_pt_regs.pt_regs = *regs;
+   bpf_pt_regs.tp = &tk->tp;
+
+   if (!trace_call_bpf(prog, &bpf_pt_regs))
+   return;
+   }
 
head = this_cpu_ptr(call->perf_events);
if (hlist_empty(head))
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index ab283e1..5b1f12c 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -391,4 +391,9 @@ store_trace_args(int ent_size, struct trace_probe *tp, 
struct pt_regs *regs,
}
 }
 
+struct bpf_pt_regs {
+   struct pt_regs pt_regs;
+   struct trace_probe *tp;
+};
+
 extern int set_print_fmt(struct trace_probe *tp, bool is_return);
-- 
1.8.5.2


[RFC PATCH 1/5] perf bpf: Add -k option for testing convenience

2015-05-17 Thread He Kuang

Add -k option to perf bpf command.

Signed-off-by: He Kuang 
---
 tools/perf/builtin-bpf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-bpf.c b/tools/perf/builtin-bpf.c
index 4ef294a..9ea34b3 100644
--- a/tools/perf/builtin-bpf.c
+++ b/tools/perf/builtin-bpf.c
@@ -11,6 +11,7 @@
 #include "builtin.h"
 #include "perf.h"
 #include "debug.h"
+#include "util/symbol.h"
 #include "parse-options.h"
 #include "bpf-loader.h"
 
@@ -30,6 +31,8 @@ static struct bpf_cmd bpf_cmds[];
 struct option bpf_options[] = {
OPT_INCR('v', "verbose", &verbose, "be more verbose "
   "(show debug information)"),
+   OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
+  "file", "vmlinux pathname"),
OPT_END()
 };
 
-- 
1.8.5.2


[RFC PATCH 0/5] Fetching local variables for bpf prog

2015-05-17 Thread He Kuang

This patch is based on https://lkml.org/lkml/2015/5/17/84 (perf tools:
introduce 'perf bpf' command to load eBPF programs).

Previous discussion on 'perf bpf: Probing with local variable':
https://lkml.org/lkml/2015/5/5/260. In that patch, we tried to
generate a bpf bytecode prologue in perf; this prologue fetches
variables and places them as bpf function parameters, to make it
easier to fetch variables in a bpf prog.

Alexei's comments:

 - Argument limitation is <=3, which is OK but should be documented.
 - Support it without debug info when kprobe is placed at the top
   of the function.
 - Make the 'config' section more concise.

Masami has mentioned:

 - The redundant functionality of both userspace and kernel variable
   parsing.
 - The possibility of replacing the old fetch_arg functions with this
   bytecode.

I've made a new version of the userspace prologue which fixes the problems
in that RFC series (not sent yet), but when trying to address Alexei's
2nd suggestion, we found that it conflicts with the argument number
limitation. By rough statistics, 13.5 percent of the functions in the
kernel have 4 or more arguments. The BPF calling convention limits the
maximum number of arguments to 5 (R1~R5); besides R1 for 'ctx', there
are 4 registers left for argument passing. It is not reasonable to pass
only the first 4 arguments when probing a function which has more than
4 arguments.

Considering Masami's suggestion to do the work in the kernel, we found that
adding a helper function for fetching bpf variables is an easier way to
reach our goals. We embed a trace_probe pointer in the 'ctx' passed to the
bpf prog, so that we can use the existing in-kernel code for fetching args.
This is similar to the 2nd suggestion, but here we do not generate any
bytecode and instead use the existing call_fetch() results directly.
Example code can be found in [RFC PATCH 5/5].

Moreover, this method removes the argument number limitation caused by
the bpf calling convention (R2-R5 for placing variables). It also leaves
users free to decide whether or not to fetch the arguments/variables;
they can call this helper function under their own conditions.

Also note:

 - We can generate syntactic sugar which converts the 'structure
   param' to function args; this can reduce the users' extra work.
 - An extra verification needs to be implemented to be sure that the user
   provides enough space for argument fetching.
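
The space check mentioned in the last item could look roughly like the
userspace sketch below: every argument is described by an offset and a size,
and nothing is stored unless all of them fit into the buffer the user
provided. All demo_* names are hypothetical, and the value member only stands
in for the result of a real fetch function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 'structure param' a user would declare, as in the sample prog. */
struct demo_param {
	unsigned long a_ops;
	unsigned long bytes;
	unsigned long page;
	unsigned long pos;
};

/* Hypothetical description of one fetched argument. */
struct demo_arg {
	size_t offset;		/* where the value goes in the buffer */
	size_t size;		/* how many bytes it occupies */
	unsigned long value;	/* stand-in for the fetched value */
};

/* Returns 0 on success, -1 if any argument would not fit into buf. */
static int demo_store_args(const struct demo_arg *args, size_t nr_args,
			   void *buf, size_t buf_size)
{
	size_t i;

	assert(args != NULL && buf != NULL);

	/* Verify everything first so that a failure leaves buf untouched. */
	for (i = 0; i < nr_args; i++) {
		if (args[i].size > sizeof(args[i].value) ||
		    args[i].offset > buf_size ||
		    args[i].size > buf_size - args[i].offset)
			return -1;
	}

	for (i = 0; i < nr_args; i++)
		memcpy((char *)buf + args[i].offset, &args[i].value,
		       args[i].size);
	return 0;
}
```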

This method's pros & cons:

pros:
 - Removes the argument number limitation.
 - Users are free to choose whether or not to do the fetch and to decide
   where to execute it.
 - Removes the redundant kernel/userspace functionality for parsing args.

cons:
 - Users have to add the 'structure param' code themselves.

Looking forward to the discussion.

He Kuang (5):
  perf bpf: Add -k option for testing convenience
  bpf: Pass trace_probe to bpf_prog for variable fetching
  bpf: Add helper function for fetching variables at probe point
  samples/bpf: Add proper prefix to objects in Makefile
  samples/bpf: Add sample for testing bpf fetch args

 include/uapi/linux/bpf.h|  1 +
 kernel/trace/bpf_trace.c| 38 
 kernel/trace/trace_kprobe.c | 11 --
 kernel/trace/trace_probe.h  |  5 +
 samples/bpf/Makefile|  3 ++-
 samples/bpf/bpf_helpers.h   |  2 ++
 samples/bpf/sample_bpf_fetch_args.c | 43 +
 tools/perf/builtin-bpf.c|  3 +++
 8 files changed, 103 insertions(+), 3 deletions(-)
 create mode 100644 samples/bpf/sample_bpf_fetch_args.c

-- 
1.8.5.2


Re: [linux-sunxi] [RFC 0/7] ARM: sun9i: SMP support with Multi-Cluster Power Management

2015-05-17 Thread Nicolas Pitre

On Sun, 17 May 2015, Maxime Ripard wrote:

> Hi Ian,
> 
> On Sat, May 16, 2015 at 11:08:46AM +0100, Ian Campbell wrote:
> > On Thu, 2015-05-14 at 14:10 +0800, Chen-Yu Tsai wrote:
> > > This is my attempt to support SMP and CPU hot plugging on the Allwinner
> > > A80 SoC. The A80 is a big.Little processor with 2 clusters of 4x Cortex-A7
> > > and 4x Cortex-A15 cores.
> > 
> > I thought there was a preference these days to support this sort of
> > thing via support PSCI in the firmware, which allows for other things
> > such as non-secure-world etc.
> 
> Yes, it is the preferred way. Meaning that if someone wants to do that
> work, he's very much welcome and encouraged to do so. But if no one's
> doing it, then we still have to have a way to bringup the secondary
> CPUs.

And doing so in the kernel (at least initially) is simpler, and so much 
easier to fix when it is broken.  We've seen a few systems already where 
power management is crippled because no one is able/allowed/willing to 
fix the broken firmware.


Nicolas

Re: suspend regression in 4.1-rc1

2015-05-17 Thread Omar Sandoval

On Sun, May 17, 2015 at 08:50:41PM +0200, Michal Hocko wrote:
> Hi,
> s2ram broke after 4.1-rc1 for me. The second s2ram simply doesn't wake
> up (fans turn on but the screen is off). I have even noticed fans
> starting also while suspended in some instances (which was especially
> annoying when it happened on the way home from work).
> I've tried /sys/power/pm_test and the issue starts at processors mode.
> Nothing really interesting shows up in the netconsole but I didn't get
> to a more detailed testing there.
> 
> I've tried to bisect this as 4.0 works reliably. This was tricky though
> because the first bad commit is a merge:
> 
> commit 1dcf58d6e6e6eb7ec10e9abc56887b040205b06f
> Merge: 80dcc31fbe55 e4b0db72be24
> Author: Linus Torvalds 
> Date:   Tue Apr 14 16:49:17 2015 -0700
> 
> Merge branch 'akpm' (patches from Andrew)
> 
> The merge commit is empty and both 80dcc31fbe55 and e4b0db72be24 work
> properly but the merge is bad. So it seems like some of the commits in
> either branch has a side effect which needs other branch in order to
> reproduce.
> 
> So I've tried to bisect ^80dcc31fbe55 e4b0db72be24 and merged 80dcc31fbe55
> in each step. This led to:
> 
> commit 195daf665a6299de98a4da3843fed2dd9de19d3a
> Author: Ulrich Obergfell 
> Date:   Tue Apr 14 15:44:13 2015 -0700
> 
> watchdog: enable the new user interface of the watchdog mechanism
> 
> The patch doesn't revert because of follow up changes so I have reverted
> all three:
> 692297d8f968 ("watchdog: introduce the hardlockup_detector_disable() 
> function")
> b2f57c3a0df9 ("watchdog: clean up some function names and arguments")
> 195daf665a62 ("watchdog: enable the new user interface of the watchdog 
> mechanism")
> 
> on top of my current Linus tree (4cfceaf0c087f47033f5e61a801f4136d6fb68c6)
> and the issue is gone. I have a hard time understanding what these 3 could have
> to do with the suspend path, though.
> 
> Then I've tried to bisect the other branch and merge 195daf665a62 during
> each step to find out which patch starts failing. This led to an even
> weirder commit a1e12da4796a ("perf tools: Add 'I' event modifier for
> exclude_idle bit") but maybe I've just screwed something on the way.
> 
> I will continue debugging tomorrow but any hints would be helpful.
> -- 
> Michal Hocko
> SUSE Labs

The symptoms here are different, and the bisect indicates that it's
completely unrelated, but just for kicks you could also try:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=533445c6e53368569e50ab3fb712230c03d523f3
-- 
Omar

Re: [RFC v1 03/11] genirq: Use CONFIG_NUMA instead of CONFIG_SMP to guard irq_common_data.node

2015-05-17 Thread Jiang Liu

On 2015/5/16 4:44, Thomas Gleixner wrote:
> On Mon, 4 May 2015, Jiang Liu wrote:
> 
>> NUMA is enabled by CONFIG_NUMA instead of CONFIG_SMP, so use CONFIG_NUMA
>> to guard irq_common_data.node.
> 
> Please move this change to the front and do it on irq_data. That would
> have avoided confusing Abel :)
Hi Thomas,
Reordering these two patches would cause too many unnecessary code
changes, so I will merge this patch into the patch that moves node from
struct irq_data into struct irq_common_data, which is more
straightforward.
Thanks!
Gerry

> 
> Thanks,
> 
>   tglx
> 

[EDT] [PATCH] smack: allow mount opts setting over filesystems with binary mount data

2015-05-17 Thread VIVEK TRIVEDI

Add support for setting smack mount labels(using smackfsdef, smackfsroot,
smackfshat, smackfsfloor, smackfstransmute) for filesystems with binary
mount data like NFS.

To achieve this, implement sb_parse_opts_str and sb_set_mnt_opts security
operations in smack LSM similar to SELinux.

Signed-off-by: Vivek Trivedi 
Signed-off-by: Amit Sahrawat 
---
 security/smack/smack.h |   18 
 security/smack/smack_lsm.c |  250 
 2 files changed, 223 insertions(+), 45 deletions(-)

diff --git a/security/smack/smack.h b/security/smack/smack.h
index b8c1a86..f5db743 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -138,6 +138,24 @@ struct smk_port_label {
struct smack_known  *smk_out;   /* outgoing label */
 };
 
+/* Super block security struct flags for mount options */
+#define FSDEFAULT_MNT  0x01
+#define FSFLOOR_MNT0x02
+#define FSHAT_MNT  0x04
+#define FSROOT_MNT 0x08
+#define FSTRANS_MNT0x10
+
+#define NUM_SMK_MNT_OPTS   5
+
+enum {
+   Opt_error = -1,
+   Opt_fsdefault = 1,
+   Opt_fsfloor = 2,
+   Opt_fshat = 3,
+   Opt_fsroot = 4,
+   Opt_fstransmute = 5,
+};
+
 /*
  * Mount options
  */
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 5eae42c..d88c27e 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "smack.h"
 
 #define TRANS_TRUE "TRUE"
@@ -64,6 +65,15 @@ static char *smk_bu_mess[] = {
"Unconfined Object",/* SMACK_UNCONFINED_OBJECT */
 };
 
+static const match_table_t tokens = {
+   {Opt_fsdefault, SMK_FSDEFAULT "%s"},
+   {Opt_fsfloor, SMK_FSFLOOR "%s"},
+   {Opt_fshat, SMK_FSHAT "%s"},
+   {Opt_fsroot, SMK_FSROOT "%s"},
+   {Opt_fstransmute, SMK_FSTRANS "%s"},
+   {Opt_error, NULL},
+};
+
 static void smk_bu_mode(int mode, char *s)
 {
int i = 0;
@@ -573,72 +583,188 @@ static int smack_sb_copy_data(char *orig, char 
*smackopts)
 }
 
 /**
- * smack_sb_kern_mount - Smack specific mount processing
+ * smack_parse_opts_str - parse Smack specific mount options
+ * @options: mount options string
+ * @opts: where to store converted mount opts
+ *
+ * Returns 0 on success or -ENOMEM on error.
+ *
+ * converts Smack specific mount options to generic security option format
+ */
+static int smack_parse_opts_str(char *options,
+   struct security_mnt_opts *opts)
+{
+   char *p;
+   char *fsdefault = NULL, *fsfloor = NULL;
+   char *fshat = NULL, *fsroot = NULL, *fstransmute = NULL;
+   int rc = -ENOMEM, num_mnt_opts = 0;
+
+   opts->num_mnt_opts = 0;
+
+   if (!options)
+   return 0;
+
+   while ((p = strsep(&options, ",")) != NULL) {
+   int token;
+   substring_t args[MAX_OPT_ARGS];
+
+   if (!*p)
+   continue;
+
+   token = match_token(p, tokens, args);
+
+   switch (token) {
+   case Opt_fsdefault:
+   if (fsdefault)
+   goto out_opt_err;
+   fsdefault = match_strdup(&args[0]);
+   if (!fsdefault)
+   goto out_err;
+   break;
+   case Opt_fsfloor:
+   if (fsfloor)
+   goto out_opt_err;
+   fsfloor = match_strdup(&args[0]);
+   if (!fsfloor)
+   goto out_err;
+   break;
+   case Opt_fshat:
+   if (fshat)
+   goto out_opt_err;
+   fshat = match_strdup(&args[0]);
+   if (!fshat)
+   goto out_err;
+   break;
+   case Opt_fsroot:
+   if (fsroot)
+   goto out_opt_err;
+   fsroot = match_strdup(&args[0]);
+   if (!fsroot)
+   goto out_err;
+   break;
+   case Opt_fstransmute:
+   if (fstransmute)
+   goto out_opt_err;
+   fstransmute = match_strdup(&args[0]);
+   if (!fstransmute)
+   goto out_err;
+   break;
+   default:
+   rc = -EINVAL;
+   pr_warn("SMACK:  unknown mount option\n");
+   goto out_err;
+   }
+   }
+
+   opts->mnt_opts = kcalloc(NUM_SMK_MNT_OPTS, sizeof(char *), GFP_ATOMIC);
+   if (!opts->mnt_opts)
+   goto out_err;
+
+   opts->mnt_opts_flags = kcalloc(NUM_SMK_MNT_OPTS, sizeof(int),
+   GFP_ATOMIC);
+   if

Re: [PATCH V3 09/13] selftests, powerpc: Add test for DSCR value inheritence across fork

2015-05-17 Thread Anton Blanchard

Hi Anshuman,

Thanks for getting these testcases into the kernel.

> This patch adds a test to verify that the changed DSCR value inside
> any process is inherited by its child process across the fork
> system call.

One issue I do notice (a bug in my original test cases too), is that we
don't restore the DSCR on exit. I'm not sure we need to go to the
trouble of saving and restoring it, but we should at least get it back
to 0 when done.

Also a tiny nit, no need for a newline in perror():

open() failed
: Permission denied
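
To illustrate the nit, here is a minimal sketch. It assumes a hypothetical helper, open_or_report(), that is not taken from the patch under review. perror() appends ": ", the error text and a newline on its own, so the prefix string should not end in "\n":

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the patch under review. */
static int open_or_report(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		/* No trailing "\n": perror() adds ": <reason>\n" itself. */
		perror("open() failed");
		return -1;
	}
	return fd;
}
```

With the prefix written this way the message stays on one line, for example "open() failed: Permission denied".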

With those changes you can add:

Signed-off-by: Anton Blanchard 

to the patches based on my testcases.

Anton

[PATCH 1/1] JFFS2: Less locking when reading directory entries

2015-05-17 Thread Mark Tomlinson

At startup, reading a directory (for example to do an ls), requires finding
information on every file. To support JFFS2 recovery from partially written
blocks, the JFFS2 driver must scan all data blocks to verify the checksums
are correct just to be able to correctly report the length of a file. This
will take some time, and will be dependent on the amount of data on the
filesystem.

What makes this worse is that any path lookup will lock the dentry cache
to add the new entry. The JFFS2 driver then spends time finding the file
information (reading the entire file), before it returns the new dentry
information allowing the cache to be unlocked. During this time, no other
files in the same directory can be opened or even tested for existence.

However, there is no need for the dentry cache to be locked for the scan of
the file. The JFFS2 driver already locks the file, so the file will not be
deleted or modified. It also ensures that if another process tries to scan
the same file, the second process will be blocked and the scan only proceeds
once.

To make the scan occur without locking the cache, a new vfs call has been
added which allows a filesystem to scan the file, but not return anything.
When the lookup occurs after this, the JFFS2 driver will find this
information and can quickly return the filled-in dentry.

Signed-off-by: Mark Tomlinson 
---
 fs/jffs2/dir.c | 41 +--
 fs/namei.c | 63 +++---
 include/linux/fs.h |  1 +
 3 files changed, 85 insertions(+), 20 deletions(-)

diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c
index 1ba5c97..69c0ec4 100644
--- a/fs/jffs2/dir.c
+++ b/fs/jffs2/dir.c
@@ -36,6 +36,7 @@ static int jffs2_rmdir (struct inode *,struct dentry *);
 static int jffs2_mknod (struct inode *,struct dentry *,umode_t,dev_t);
 static int jffs2_rename (struct inode *, struct dentry *,
 struct inode *, struct dentry *);
+static void jffs2_prescan(struct inode *dir_i, struct qstr *d_name);
 
 const struct file_operations jffs2_dir_operations =
 {
@@ -51,6 +52,7 @@ const struct inode_operations jffs2_dir_inode_operations =
 {
.create =   jffs2_create,
.lookup =   jffs2_lookup,
+   .prescan =  jffs2_prescan,
.link = jffs2_link,
.unlink =   jffs2_unlink,
.symlink =  jffs2_symlink,
@@ -74,8 +76,12 @@ const struct inode_operations jffs2_dir_inode_operations =
and we use the same hash function as the dentries. Makes this
nice and simple
 */
-static struct dentry *jffs2_lookup(struct inode *dir_i, struct dentry *target,
-  unsigned int flags)
+/* The prescan function does not have a dentry to fill in, so create this 
common function
+ * which is just passed the name and the inode for the directory.
+ * This function is very similar to the original jffs2_lookup, except for the 
arguments
+ * and the fact that the dentry (now not passed) is not updated.
+ */
+static struct inode *jffs2_lookup_common(struct inode *dir_i, struct qstr 
*d_name)
 {
struct jffs2_inode_info *dir_f;
struct jffs2_full_dirent *fd = NULL, *fd_list;
@@ -84,7 +90,7 @@ static struct dentry *jffs2_lookup(struct inode *dir_i, 
struct dentry *target,
 
jffs2_dbg(1, "jffs2_lookup()\n");
 
-   if (target->d_name.len > JFFS2_MAX_NAME_LEN)
+   if (d_name->len > JFFS2_MAX_NAME_LEN)
return ERR_PTR(-ENAMETOOLONG);
 
dir_f = JFFS2_INODE_INFO(dir_i);
@@ -92,11 +98,11 @@ static struct dentry *jffs2_lookup(struct inode *dir_i, 
struct dentry *target,
mutex_lock(&dir_f->sem);
 
/* NB: The 2.2 backport will need to explicitly check for '.' and '..' 
here */
-   for (fd_list = dir_f->dents; fd_list && fd_list->nhash <= 
target->d_name.hash; fd_list = fd_list->next) {
-   if (fd_list->nhash == target->d_name.hash &&
+   for (fd_list = dir_f->dents; fd_list && fd_list->nhash <= d_name->hash; 
fd_list = fd_list->next) {
+   if (fd_list->nhash == d_name->hash &&
(!fd || fd_list->version > fd->version) &&
-   strlen(fd_list->name) == target->d_name.len &&
-   !strncmp(fd_list->name, target->d_name.name, 
target->d_name.len)) {
+   strlen(fd_list->name) == d_name->len &&
+   !strncmp(fd_list->name, d_name->name, d_name->len)) {
fd = fd_list;
}
}
@@ -108,6 +114,27 @@ static struct dentry *jffs2_lookup(struct inode *dir_i, 
struct dentry *target,
if (IS_ERR(inode))
pr_warn("iget() failed for ino #%u\n", ino);
}
+   return inode;
+}
+
+/* Fill in an inode, and store the information in cache. This allows a
+ * subsequent jffs2_lookup() call to proceed quickly, which is useful
+ * since the jffs2_lookup() call will have the directory entry cache
+ * locked.
+ */

[PATCH 0/1] JFFS2: Less locking when reading directory entries

2015-05-17 Thread Mark Tomlinson

I have posted this before, but have extended the patch into a few more
functions. The intent of the code is as before -- to improve JFFS2 lookups
by not locking i_mutex for long periods when files are not in cache. For
our embedded environment, we see a *five second* improvement in boot time.

This patch is an attempt to improve the speed of JFFS2 at startup. Our
particular problem is that we have a 30MB file on NOR flash which takes
about five seconds to read and test the data CRCs. During this time access
to other files in the same directory is blocked, due to
parent->d_inode->i_mutex being locked.

This patch solves this problem by adding a 'pre-lookup' call down to JFFS2,
which can be called without this mutex held. When the actual lookup is
performed, the results are in JFFS2's cache, and the dentry can be filled
in quickly.
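
The idea can be modelled in plain userspace C. This is only a single-threaded sketch under stated assumptions: struct dir, prescan() and lookup() here are hypothetical stand-ins, the "locked" flag merely models parent->d_inode->i_mutex, and none of this is the actual VFS or JFFS2 code. It counts how many expensive scans happen while the directory lock is held:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FILES 8

/* Hypothetical stand-in for the filesystem's per-file cache entry. */
struct file_info {
	const char *name;
	bool scanned;		/* has the expensive CRC scan been done? */
};

struct dir {
	struct file_info files[MAX_FILES];
	int nfiles;
	bool locked;		/* models parent->d_inode->i_mutex */
	int slow_scans_under_lock;
};

static struct file_info *find(struct dir *d, const char *name)
{
	for (int i = 0; i < d->nfiles; i++)
		if (strcmp(d->files[i].name, name) == 0)
			return &d->files[i];
	return NULL;
}

/* The expensive part: walk the whole file and check its CRCs. */
static void scan_file(struct dir *d, struct file_info *f)
{
	if (f->scanned)
		return;		/* result is already cached */
	if (d->locked)
		d->slow_scans_under_lock++;
	f->scanned = true;
}

/* Called without the directory lock; only warms the cache. */
static void prescan(struct dir *d, const char *name)
{
	struct file_info *f = find(d, name);

	assert(!d->locked);
	if (f)
		scan_file(d, f);
}

/* Called with the directory lock held, as a normal lookup is. */
static bool lookup(struct dir *d, const char *name)
{
	struct file_info *f;

	d->locked = true;
	f = find(d, name);
	if (f)
		scan_file(d, f);
	d->locked = false;
	return f != NULL;
}
```

A lookup that follows a prescan finds the scan already cached, so the slow work never runs while other lookups in the same directory are blocked.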

However, given that I do not have experience with Linux filesystem code,
I can't be sure that this is a correct solution, or that there isn't a
better way of achieving what I'm trying to do. I feel there must be a way
to do this without creating a new VFS function call.

I suspect other filesystems could benefit from this too, as a lot of them
call the same d_splice_alias() function to fill in the dentry. JFFS2
already seems to have all the lower-level locks that are needed for this to
work; I don't know if that's true in other filesystems which could be
relying on the directory's i_mutex being locked. Because JFFS2 needs to
walk the entire file, there are big gains to be made here; other filesystems
may gain little to nothing.

I'm not expecting that this patch will get applied as-is, but please let me
know if there is any merit to it, whether it should work, and what still
needs to be done if this is to be made part of the kernel.


Re: [PATCH v7 3/5] pwm: kona: Fix incorrect config, disable, and polarity procedures

2015-05-17 Thread Tim Kryger

On Tue, May 12, 2015 at 4:28 PM, Jonathan Richardson
 wrote:

> The polarity procedure no longer applies the settings to change the
> output signal because it can't be called when the pwm is enabled anyway.
> The polarity is only updated in the control register. The correct
> polarity will be applied on enable. The old method of applying changes
> would result in no signal when the polarity was changed. The new
> apply_settings function would fix this problem but it isn't required
> anyway.

Thanks for incorporating some of my suggestions in your latest version.

I'm still concerned about delaying when polarity changes take effect.

Since backlight is a common use of PWM, consider the following situation.

backlight {
compatible = "pwm-backlight";
pwms = < 0 500 PWM_POLARITY_NORMAL>;
brightness-levels = <0 4 8 16 32 64 128 255>;
default-brightness-level = <0>;
};

The Kona PWM hardware starts in inversed mode so it will drive output high
once its clock is enabled during the probe.

Polarity is not adjusted during probe so it stays high and it registers with
the PWM core using the new pwmchip_add_inversed() function.

Next, the pwm-backlight driver probe executes and it calls devm_pwm_get()
which then calls pwm_set_period() and most importantly pwm_set_polarity().

The output would change to constant low at this point in the original driver
but with your proposed change it will remain high.

The driver sets bl->props.brightness and calls backlight_update_status() but,
since in this case the default brightness is zero, it assumes it doesn't need
to enable the PWM.

The backlight driver probe then returns and the PWM output is incorrect.

Re: [EDT][PATCH 1/1] hw_breakpoint.c :cpu hotplug handling

2015-05-17 Thread Vaneet Narang

On Wed, May 13, 2015 at 06:24:06AM +0100, Maninder Singh wrote:
>> Subject: [PATCH 1/1] hw_breakpoint.c :cpu hotplug handling
>> 
>> This patch adds support for CPU hotplug. It reinstalls all installed
>> watchpoints and breakpoints back on the H/W in case of CPU hotplug.

>Not sure why this is needed -- the scheduler should reinstall the
>breakpoints when the debugged task gets scheduled in via
>arch_install_hw_breakpoint.
>
>Will

I agree that reinstalling has to be taken care of by either the scheduler or
the debug tool.
In the current implementation we clear the H/W registers but we don't clear
the slots (wp_on_reg / bp_on_reg) for either watchpoints or breakpoints.
That makes it mandatory for the debug tool to uninstall breakpoints when a CPU
goes offline, which I think should not be required. If we don't uninstall
them, then when the CPU comes back online we will not be able to reinstall
them because there will be no free slots, even though the H/W registers are
free.
Logically we should either clear these slots or reinstall the breakpoints. If
you think reinstalling has to be taken care of by the scheduler or debugger,
then at least we should clear these slots.
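
A small model of the bookkeeping problem, as a sketch under stated assumptions: struct cpu_debug, install_bp() and cpu_offline() are hypothetical names that only loosely mirror the bp_on_reg / wp_on_reg idea, and this is not the arch hw_breakpoint code:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SLOTS 4

/* Toy per-CPU state; loosely mirrors the bp_on_reg idea from the thread. */
struct cpu_debug {
	const void *slot[NUM_SLOTS];		/* software bookkeeping */
	unsigned long hw_reg[NUM_SLOTS];	/* models the hardware registers */
};

/* Returns the slot index used, or -1 if every slot looks busy. */
static int install_bp(struct cpu_debug *c, const void *bp, unsigned long addr)
{
	assert(bp != NULL);
	for (int i = 0; i < NUM_SLOTS; i++) {
		if (!c->slot[i]) {
			c->slot[i] = bp;
			c->hw_reg[i] = addr;
			return i;
		}
	}
	return -1;
}

/* CPU goes offline: hardware state is lost; optionally clear bookkeeping. */
static void cpu_offline(struct cpu_debug *c, int clear_slots)
{
	for (int i = 0; i < NUM_SLOTS; i++) {
		c->hw_reg[i] = 0;
		if (clear_slots)
			c->slot[i] = NULL;
	}
}
```

If the slots are left populated across the offline event, a later install attempt fails even though every modelled hardware register is free; clearing the slots makes the reinstall succeed.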

Regards,
Vaneet Narang

Re: [PATCH] alpha: Wire up missing syscalls

2015-05-17 Thread Chen Gang

On 05/13/2015 08:55 AM, Chen Gang wrote:
> On 05/12/2015 10:29 PM, Dave Jones wrote:
>> likewise sys_bpf judging by the absence of bpf_int_jit_compile and friends 
>> in arch/alpha
>> The weak symbols mean it probably compiles/links, but it doesn't actually do
>> anything, and now instead of -ENOSYS, anyone trying to actually use that 
>> syscall
>> on alpha will get weird results.
>>
>> Shutting up warnings like this strikes me as the wrong thing to do.
>>
> 
> It sounds reasonable. For me, we need to wire up the implemented syscalls,
> and still leave the unimplemented syscalls as build warnings.
> 
> If there is no additional reply, I shall try to send patch v2 for it within
> this week (2015-05-17).
> 

Sorry for sending patch v2 late (I sent it just now).

The reason is that I could not log in to the hotmail and gmail servers at home
over the last weekend (although the network was OK for other websites).


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: perf.data file format specification draft

2015-05-17 Thread Namhyung Kim

Hi Arnaldo and Andi,

On Thu, May 14, 2015 at 10:11:29AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, May 14, 2015 at 02:25:17PM +0200, Andi Kleen escreveu:
> > Hi,
> > 
> > Since there are more and more consumers I started a description of the
> > on-disk perf.data format. This does not replace the kernel perf event
> > description or the manpage, but describes the parts that perf record
> > adds.
> > 
> > So far it still has some gaps and needs review. Eventually this should
> > become part of the perf documentation.
> > 
> > Steven, would be good if you could fill in some details on how trace
> > data works. 
> 
> I guess that would be Frédéric, and also I think this is a good
> opportunity to remove some stuff that seem to be collected but unused,
> namely a kallsyms copy and maybe something else.

Let me try to describe.  The tracing_data_get() does the work and it
records the following for HEADER_TRACING_DATA in order:

 * tracing header data
   - file magic bytes (including someone's birthday :) )
   - file version (it's 0.5 - the only difference I can see from trace-cmd
     (version 6) is the "saved-cmdline" file data, which is unnecessary for perf)
   - byte order, size of long and page size of the system

 * tracing header files
   - $tracefs/events/header_page and $tracefs/events/header_event
   - describe ftrace raw buffer format which is unnecessary for perf unless
 it reads the raw buffer directly (like my ftrace integration work?)

 * ftrace event files
   - format file for each event in $tracefs/events/ftrace directory
   - this is same as below but precedes other events, not sure why it's needed

 * (normal) event files
   - format file for each tracepoint event

 * /proc/kallsyms
   - for kernel symbol resolution, unnecessary for perf

 * tracing printk formats
   - for trace_printk?  unnecessary for perf

The last two can go away at least.
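
As a rough sketch of how a consumer might read the first item in the list (the tracing header data), here is a small parser. Everything about the byte layout is an assumption on my part and should be checked against tools/perf/util/trace-event-info.c: the magic is taken to be the bytes 23, 8, 68 followed by "tracing", the version a NUL-terminated string, the byte order and size of long one byte each, and the page size a 32-bit value in the recorded byte order. struct tracing_header and parse_tracing_header() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed magic: bytes 23, 8, 68 followed by "tracing" (see lead-in). */
static const unsigned char TRACING_MAGIC[10] = {
	23, 8, 68, 't', 'r', 'a', 'c', 'i', 'n', 'g'
};

struct tracing_header {
	char version[8];	/* NUL-terminated, e.g. "0.5" */
	int big_endian;		/* byte order flag of the recording system */
	unsigned int long_size;	/* sizeof(long) on the recording system */
	uint32_t page_size;
};

/* Returns bytes consumed, or -1 if the buffer is malformed or too short. */
static long parse_tracing_header(const unsigned char *buf, size_t len,
				 struct tracing_header *h)
{
	size_t pos = sizeof(TRACING_MAGIC);
	const unsigned char *nul;
	size_t vlen;

	assert(buf != NULL && h != NULL);
	if (len < pos || memcmp(buf, TRACING_MAGIC, pos) != 0)
		return -1;

	/* The version is a NUL-terminated string right after the magic. */
	nul = memchr(buf + pos, '\0', len - pos);
	if (!nul)
		return -1;
	vlen = (size_t)(nul - (buf + pos));
	if (vlen >= sizeof(h->version))
		return -1;
	memcpy(h->version, buf + pos, vlen + 1);
	pos += vlen + 1;

	/* Assumed widths: 1 byte order, 1 byte long size, 4 bytes page size. */
	if (len - pos < 6)
		return -1;
	h->big_endian = buf[pos++];
	h->long_size = buf[pos++];
	if (h->big_endian)
		h->page_size = (uint32_t)buf[pos] << 24 |
			       (uint32_t)buf[pos + 1] << 16 |
			       (uint32_t)buf[pos + 2] << 8 | buf[pos + 3];
	else
		h->page_size = (uint32_t)buf[pos + 3] << 24 |
			       (uint32_t)buf[pos + 2] << 16 |
			       (uint32_t)buf[pos + 1] << 8 | buf[pos];
	return (long)(pos + 4);
}
```

The header files, event format files, kallsyms and printk formats would follow this header; they are not covered by the sketch.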

Thanks,
Namhyung


> 
> > Adrian, would be good if you could fill in the missing bits for
> > auxtrace/itrace.
> > Everyone else, please review and add missing information.
> 
> Thanks for doing this work!
> 
> IIRC there is a presentation written by Jiri where parts of this is
> documented, lemme try to find it...
> 
> - Arnaldo

[PATCH] alpha: kernel: osf_sys: Set 'kts.tv_nsec' only when 'tv' has effect

2015-05-17 Thread Chen Gang

The related warning:

CC  init/do_mounts.o
  arch/alpha/kernel/osf_sys.c: In function ‘SyS_osf_settimeofday’:
  arch/alpha/kernel/osf_sys.c:1028:14: warning: ‘kts.tv_nsec’ may be used 
uninitialized in this function [-Wmaybe-uninitialized]
kts.tv_nsec *= 1000;
^
  arch/alpha/kernel/osf_sys.c:1016:18: note: ‘kts’ was declared here
struct timespec kts;
^

Signed-off-by: Chen Gang 
---
 arch/alpha/kernel/osf_sys.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index e51f578..36dc91a 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1019,14 +1019,13 @@ SYSCALL_DEFINE2(osf_settimeofday, struct timeval32 
__user *, tv,
if (tv) {
if (get_tv32((struct timeval *)&kts, tv))
return -EFAULT;
+   kts.tv_nsec *= 1000;
}
if (tz) {
if (copy_from_user(&ktz, tz, sizeof(*tz)))
return -EFAULT;
}
 
-   kts.tv_nsec *= 1000;
-
return do_sys_settimeofday(tv ? &kts : NULL, tz ? &ktz : NULL);
 }
 
-- 
1.7.9.5

Re: [PATCH 1/1] Staging: comedi: fix line longer than 80 chars in cb_pcidas64.c

2015-05-17 Thread Sudip Mukherjee

On Sun, May 17, 2015 at 04:47:23PM +0200, Amaury Denoyelle wrote:
> This patch fixes coding style errors reported by checkpatch.pl for
> cb_pcidas64.c, about too long source code lines.
> 
> Signed-off-by: Amaury Denoyelle 
> ---

>  }
>  
> -/* adjusts the size of hardware fifo (which determines block size for dma 
> xfers) */
> +/* adjusts the size of hardware fifo
> + * (which determines block size for dma xfers) */

This is not the style for multi-line comments. Please check CodingStyle
in Documentation.
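
For reference, a sketch of the multi-line comment layout that CodingStyle asks for: the opening and closing markers sit on their own lines and every text line starts with " * ". The function underneath is a hypothetical stand-in so the snippet compiles; it is not code from cb_pcidas64.c:

```c
#include <assert.h>

/*
 * Adjust the size of the hardware FIFO, which determines the block
 * size for DMA transfers.
 *
 * Hypothetical stand-in: rounds num_samples up to a multiple of
 * chunk so that there is something to compile.
 */
static unsigned int round_up_fifo_size(unsigned int num_samples,
				       unsigned int chunk)
{
	assert(chunk != 0);
	return (num_samples + chunk - 1) / chunk * chunk;
}
```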

>  static int set_ai_fifo_size(struct comedi_device *dev, unsigned int 
> num_samples)
>  {
  
>  
> @@ -1987,8 +1990,8 @@ static unsigned int get_divisor(unsigned int ns, 
> unsigned int flags)
>  
>  /* utility function that rounds desired timing to an achievable time, and
>   * sets cmd members appropriately.
> - * adc paces conversions from master clock by dividing by (x + 3) where x is 
> 24 bit number
> - */
> + * adc paces conversions from master clock by dividing by (x + 3) where x is
> + * 24 bit number */
same here

and when you are sending just one patch, you do not need to mention
[Patch 1/1] in the subject. just mention [Patch]

regards
sudip

[PATCH v2] alpha: Wire up all missing implemented syscalls

2015-05-17 Thread Chen Gang

Leave the syscalls that are still unimplemented as warnings. The
related warnings for the implemented syscalls that were not wired up:

CALLscripts/checksyscalls.sh
  :1241:2: warning: #warning syscall getrandom not implemented [-Wcpp]
  :1244:2: warning: #warning syscall memfd_create not implemented [-Wcpp]
  :1250:2: warning: #warning syscall execveat not implemented [-Wcpp]

Signed-off-by: Chen Gang 
---
 arch/alpha/include/asm/unistd.h  |2 +-
 arch/alpha/include/uapi/asm/unistd.h |3 +++
 arch/alpha/kernel/systbls.S  |3 +++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h
index c509d30..a56e608 100644
--- a/arch/alpha/include/asm/unistd.h
+++ b/arch/alpha/include/asm/unistd.h
@@ -3,7 +3,7 @@
 
 #include 
 
-#define NR_SYSCALLS511
+#define NR_SYSCALLS514
 
 #define __ARCH_WANT_OLD_READDIR
 #define __ARCH_WANT_STAT64
diff --git a/arch/alpha/include/uapi/asm/unistd.h 
b/arch/alpha/include/uapi/asm/unistd.h
index d214a035..aa33bf5 100644
--- a/arch/alpha/include/uapi/asm/unistd.h
+++ b/arch/alpha/include/uapi/asm/unistd.h
@@ -472,5 +472,8 @@
 #define __NR_sched_setattr 508
 #define __NR_sched_getattr 509
 #define __NR_renameat2 510
+#define __NR_getrandom 511
+#define __NR_memfd_create  512
+#define __NR_execveat  513
 
 #endif /* _UAPI_ALPHA_UNISTD_H */
diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S
index 2478971..9b62e3f 100644
--- a/arch/alpha/kernel/systbls.S
+++ b/arch/alpha/kernel/systbls.S
@@ -529,6 +529,9 @@ sys_call_table:
.quad sys_sched_setattr
.quad sys_sched_getattr
.quad sys_renameat2 /* 510 */
+   .quad sys_getrandom
+   .quad sys_memfd_create
+   .quad sys_execveat
 
.size sys_call_table, . - sys_call_table
.type sys_call_table, @object
-- 
1.7.9.5

Re: suspend regression in 4.1-rc1

2015-05-17 Thread Linus Torvalds

On Sun, May 17, 2015 at 11:50 AM, Michal Hocko  wrote:
>
> The merge commit is empty and both 80dcc31fbe55 and e4b0db72be24 work
> properly but the merge is bad. So it seems like some of the commits in
> either branch has a side effect which needs other branch in order to
> reproduce.
>
> So I've tried to bisect ^80dcc31fbe55 e4b0db72be24 and merged 80dcc31fbe55
> in each step.

Good extra work! Thanks.

> This led to:
>
> commit 195daf665a6299de98a4da3843fed2dd9de19d3a
> Author: Ulrich Obergfell 
> Date:   Tue Apr 14 15:44:13 2015 -0700
>
> watchdog: enable the new user interface of the watchdog mechanism
>
> The patch doesn't revert because of follow up changes so I have reverted
> all three:
> 692297d8f968 ("watchdog: introduce the hardlockup_detector_disable() 
> function")
> b2f57c3a0df9 ("watchdog: clean up some function names and arguments")
> 195daf665a62 ("watchdog: enable the new user interface of the watchdog 
> mechanism")

Hmm. I guess we should just revert those three then. Unless somebody
can see what the subtle interaction is.

Actually, looking closer, on the *other* side of the merge, the only
commit that looks like it might be conflicting is

b3738d293233 "watchdog: Add watchdog enable/disable all functions"

which is then used by

b37609c30e41 "perf/x86/intel: Make the HT bug workaround
conditional on HT enabled"

Does the problem go away if you revert *those* two commits instead?

At least that would tell us what the exact bad interaction is.

Adding Stephane (author of those watchdog/perf patches) to the Cc. And
PeterZ, who signed them off (Ingo also did, but was already on the
participants list).

Anybody see it?

   Linus

Re: [PATCH 03/10] crypto: omap-sham: Add support for omap3 devices

2015-05-17 Thread Herbert Xu

On Fri, May 15, 2015 at 11:19:33AM +0200, Pali Rohár wrote:
> On Saturday 28 February 2015 17:25:02 Pavel Machek wrote:
> > On Thu 2015-02-26 14:49:53, Pali Rohár wrote:
> > > omap3 support is same as omap2, just with different IO address (specified 
> > > in DT)
> > > 
> > > Signed-off-by: Pali Rohár 
> > 
> > Acked-by: Pavel Machek 
> > 
> > > @@ -1792,6 +1792,10 @@ static const struct of_device_id 
> > > omap_sham_of_match[] = {
> > >   .data   = &omap_sham_pdata_omap2,
> > >   },
> > >   {
> > > + .compatible = "ti,omap3-sham",
> > > + .data   = &omap_sham_pdata_omap2,
> > > + },
> > > + {
> > >   .compatible = "ti,omap4-sham",
> > >   .data   = &omap_sham_pdata_omap4,
> > >   },
> > 
> 
> Herbert, this is second crypto patch in this series, can you apply it too?

Applied.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 1/6] crypto: md5: add MD5 initial vectors

2015-05-17 Thread Herbert Xu

On Sun, May 17, 2015 at 12:54:12PM +0200, LABBE Corentin wrote:
> This patch simply adds the MD5 IV in the md5 header.
> 
> Signed-off-by: LABBE Corentin 

All applied.  Thanks!
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds

On Sun, May 17, 2015 at 8:42 PM, Al Viro  wrote:
>
> "Rest of the path" makes no sense, obviously.  "More of the path" (and _not_
> as a string, TYVM - we have those components in ->d_name.name of dentries we
> want revalidated [..])

For revalidate, yes we kind of have them as dentries. I say kind of,
because it may be that we're only revalidating a directory in the
middle, and the rest of the path will be all new lookups, and we don't
have that part as dentries at all. But nobody is going to care about
"revalidate" for those.

HOWEVER.

We haven't actually walked/parsed the rest of the pathname yet at that
point, and we generally probably shouldn't, since we don't know if the
filesystem really is going to care. Why do extra work that may not be
useful?

So if we really do want to do it, and some filesystem cares enough, I
think we actually should just pass it in as a string, and then have a
helper function to say "ok, filesystem, if you want to revalidate the
rest of the path, use this function to turn the hinting string into
more dentries".

Because I don't think it's worth doing up-front if it's not clear that
the filesystem is going to care.

For example, the /proc filesystem uses d_revalidate() to re-check the
pid entries. That does *not* mean that it wants the rest of the
dentries pre-parsed at all. So by all means give it a "const char
*hint" for the rest, but it's going to just ignore it anyway, so don't
waste time parsing it as "these will be the following dentries we will ask
you to revalidate".

So I do think that "const char *hint" might be the right thing to pass
down if we really care about this sufficiently. Both to revalidate and
to lookup().

We currently have that "rest" in lookup_slow() as

nd->last.name + hashlen_len(nd->last.hash_len)

and we could just pass that down to lookup_dcache (which does
revalidate) and lookup_real() (which does the actual ->lookup).

But I guess we could just save off 'name' into a new field in
nameidata in link_path_walk() after we've removed the slashes? I don't
know if it matters. We can mask off the slashes again if/when people
want to use the hint, after all.
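
To make the "rest as a string" idea concrete, here is a minimal userspace
sketch (not kernel code). It mimics the expression above: the hint is the
component start plus the component length, and stripping slashes is left to
whoever decides to use the hint. The helper names path_hint(), skip_slashes()
and component_len() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace illustration only. Given the start of the current path
 * component and its length, return the unparsed remainder of the
 * pathname, i.e. the "hint". The leading slashes are still present.
 */
const char *path_hint(const char *name, size_t len)
{
	assert(len <= strlen(name));
	return name + len;
}

/* Strip the slashes that separate the current component from the hint. */
const char *skip_slashes(const char *s)
{
	while (*s == '/')
		s++;
	return s;
}

/* Length of the next component: everything up to the next '/' or NUL. */
size_t component_len(const char *s)
{
	return strcspn(s, "/");
}
```

A filesystem that ignores the hint pays nothing for it; one that cares would
call skip_slashes() and component_len() itself, only when it wants to.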

   Linus

Re: [RFC PATCH 00/17][request for stable 3.10 inclusion] x86/nmi: Print all cpu stacks from NMI safely

2015-05-17 Thread long.wanglong

On 2015/5/14 21:55, Steven Rostedt wrote:
> On Thu, 14 May 2015 11:34:47 +
> Wang Long  wrote:
> 
>> Patches 1-13 backport the "seq_buf" infrastructure. In detail, patches 1, 2
>> and 6 only backport "seq_buf"-related code.
>>
> 
> Ah, so basically you just backported the seq_buf.c code without
> modifying the trace_seq code. That's a good approach. I don't have much
> time to look at these but I'll try to skim them to see if I find
> anything broken.
> 
> I may pull all of them into a test branch and run my tests to make sure
> they don't break anything else.
> 
> -- Steve
> 
Hi Steve,

Thank you for the review and testing. Do your test cases run OK with this
patch series?

Best Regards
Wang Long
> 
>>  arch/x86/kernel/apic/hw_nmi.c |  86 +-
>>  include/linux/percpu.h        |   4 +
>>  include/linux/printk.h        |   2 +
>>  include/linux/seq_buf.h       | 136 
>>  kernel/printk.c               |  41 +++--
>>  lib/Makefile                  |   2 +-
>>  lib/seq_buf.c                 | 359 ++
>>  7 files changed, 617 insertions(+), 13 deletions(-)
>>  create mode 100644 include/linux/seq_buf.h
>>  create mode 100644 lib/seq_buf.c
>>
> 
> 
> .
> 



[PATCH v2] serial: 8250_uniphier: add UniPhier serial driver

2015-05-17 Thread Masahiro Yamada

Add the driver for on-chip UART used on UniPhier SoCs.

This hardware is similar to the 8250 with a slightly different register
mapping, so it should go into the drivers/tty/serial/8250 directory.

Signed-off-by: Masahiro Yamada 
---

Changes in v2:
  - Drop unnecessary #include 
  - Sort includes in alphabetical order
  - Use devm_clk_get() rather than of_clk_get()
  - Delete unneeded clk_put() from uniphier_uart_remove callback
  - Delete unneeded IS_ERR_OR_NULL check from uniphier_uart_remove callback
  - Use UNIPHIER_UART_*_SHIFT instead of hard-coded shift values
  - Change the first argument type of uniphier_of_serial_setup()
    from (struct platform_device *) to (struct device *) for code-cleanup.

 drivers/tty/serial/8250/8250_uniphier.c | 247 
 drivers/tty/serial/8250/Kconfig         |   7 +
 drivers/tty/serial/8250/Makefile        |   1 +
 3 files changed, 255 insertions(+)
 create mode 100644 drivers/tty/serial/8250/8250_uniphier.c

diff --git a/drivers/tty/serial/8250/8250_uniphier.c b/drivers/tty/serial/8250/8250_uniphier.c
new file mode 100644
index 000..aabb64b
--- /dev/null
+++ b/drivers/tty/serial/8250/8250_uniphier.c
@@ -0,0 +1,247 @@
+/*
+ * Copyright (C) 2015 Masahiro Yamada 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "8250.h"
+
+/* Most (but not all) of UniPhier UART devices have 64-depth FIFO. */
+#define UNIPHIER_UART_DEFAULT_FIFO_SIZE 64
+
+#define UNIPHIER_UART_CHAR_FCR      3
+#define UNIPHIER_UART_CHAR_SHIFT    8   /* Character Register */
+#define UNIPHIER_UART_FCR_SHIFT     0   /* FIFO Control Register */
+#define UNIPHIER_UART_LCR_MCR       4
+#define UNIPHIER_UART_LCR_SHIFT     8   /* Line Control Register */
+#define UNIPHIER_UART_MCR_SHIFT     0   /* Modem Control Register */
+#define UNIPHIER_UART_DLR           9   /* Divisor Latch Register */
+
+/*
+ * The register map is slightly different from that of 8250.
+ * IO callbacks must be overridden for correct access to FCR, LCR, and MCR.
+ */
+static unsigned int uniphier_serial_in(struct uart_port *p, int offset)
+{
+   int valshift = 0;
+
+   switch (offset) {
+   case UART_LCR:
+   offset = UNIPHIER_UART_LCR_MCR;
+   valshift = UNIPHIER_UART_LCR_SHIFT;
+   break;
+   case UART_MCR:
+   offset = UNIPHIER_UART_LCR_MCR;
+   valshift = UNIPHIER_UART_MCR_SHIFT;
+   break;
+   default:
+   break;
+   }
+
+   offset <<= p->regshift;
+
+   /*
+* The return value must be masked with 0xff because LCR and MCR reside
+* in the same register that must be accessed by 32-bit write/read.
+* 8 or 16 bit access to this hardware results in unexpected behavior.
+*/
+   return (readl(p->membase + offset) >> valshift) & 0xff;
+}
+
+static void uniphier_serial_out(struct uart_port *p, int offset, int value)
+{
+   int valshift = 0;
+   bool normal = false;
+
+   switch (offset) {
+   case UART_FCR:
+   offset = UNIPHIER_UART_CHAR_FCR;
+   valshift = UNIPHIER_UART_FCR_SHIFT;
+   break;
+   case UART_LCR:
+   offset = UNIPHIER_UART_LCR_MCR;
+   valshift = UNIPHIER_UART_LCR_SHIFT;
+   /* Divisor latch access bit does not exist. */
+   value &= ~(UART_LCR_DLAB << valshift);
+   break;
+   case UART_MCR:
+   offset = UNIPHIER_UART_LCR_MCR;
+   valshift = UNIPHIER_UART_MCR_SHIFT;
+   break;
+   default:
+   normal = true;
+   break;
+   }
+
+   offset <<= p->regshift;
+
+   if (normal) {
+   writel(value, p->membase + offset);
+   } else {
+   /* special case: two registers share the same address. */
+   u32 tmp = readl(p->membase + offset);
+
+   tmp &= ~(0xff << valshift);
+   tmp |= value << valshift;
+   writel(tmp, p->membase + offset);
+   }
+}
+
+/*
+ * This hardware does not have the divisor latch access bit.
+ * The divisor latch register exists at different address.
+ * Override dl_read/write callbacks.
+ */
+static int uniphier_serial_dl_read(struct uart_8250_port *up)
+{
+   return readl(up->port.membase + UNIPHIER_UART_DLR);
+}
+
+static void uniphier_serial_dl_write(struct uart_8250_port

[PATCH v2] clk: mediatek: Initialize clk_init_data

2015-05-17 Thread Ricky Liang

The variable init (struct clk_init_data) is allocated on the stack.
We weren't initializing the .flags field, so it contains random junk,
which can cause all kinds of interesting issues when the flags are
parsed by clk_register.

Signed-off-by: Ricky Liang 
---
 drivers/clk/mediatek/clk-pll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/mediatek/clk-pll.c b/drivers/clk/mediatek/clk-pll.c
index 66154ca..44409e9 100644
--- a/drivers/clk/mediatek/clk-pll.c
+++ b/drivers/clk/mediatek/clk-pll.c
@@ -268,7 +268,7 @@ static struct clk *mtk_clk_register_pll(const struct mtk_pll_data *data,
void __iomem *base)
 {
struct mtk_clk_pll *pll;
-   struct clk_init_data init;
+   struct clk_init_data init = {};
struct clk *clk;
const char *parent_name = "clk26m";
 
-- 
2.1.2
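
For readers following along, a minimal userspace sketch of why the initializer
matters. The struct below is a hypothetical stand-in for struct clk_init_data
(field names are illustrative only), and it uses the portable "{ 0 }" spelling;
the empty "{}" form in the patch is a GNU extension (standardized in C23) with
the same effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk_init_data; illustrative fields. */
struct init_data {
	const char *name;
	const char *const *parent_names;
	unsigned int num_parents;
	unsigned long flags;
};

/*
 * Fill in only some of the fields, as a registration helper might.
 * The "= { 0 }" initializer guarantees that every field not assigned
 * below (here: flags) is zero rather than leftover stack contents.
 */
struct init_data make_init_data(const char *name,
				const char *const *parents,
				unsigned int num_parents)
{
	struct init_data init = { 0 };

	assert(name != NULL);
	init.name = name;
	init.parent_names = parents;
	init.num_parents = num_parents;
	/* init.flags is deliberately not assigned here. */
	return init;
}
```

Without the initializer, reading init.flags afterwards would read an
indeterminate value, which is the failure mode described in the commit
message.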


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Al Viro

On Sun, May 17, 2015 at 07:56:26PM -0700, Linus Torvalds wrote:

> > So for Al's example of revalidating multiple components at once, once the VFS
> > gets to a point in the path where  d_revalidate says "I need more time",
> > the VFS just passes the rest of the path to the filesystem.
> 
> That's bullshit, for a very simple and basic reason: "the rest of the
> path" is not necessarily at all for your filesystem!
> 
> Really. There might be mount-points, there might be symlinks, there
> might be tons of stuff like that.

"Rest of the path" makes no sense, obviously.  "More of the path" (and _not_
as a string, TYVM - we have those components in ->d_name.name of dentries we
want revalidated, complete with hashes, so WTF redo that?) is fine, though,
because the caller knows exactly where the mountpoints are, so we know how far
we can go.

> Now, this is why I said we can do a "hint" style thing. Part of that
> "hint" issue is very very much that it has no semantic meaning. You
> can't screw it up, because if it turns out that the path component
> we're looking up is a symlink and we actually end up in some other
> filesystem, if you end up looking up the hint part, it just would
> never actually get used.

For revalidate "this used to be a symlink, now it's not" or vice versa
means simply "it's gone stale".  Which is fine - again, the caller knows
where in that chain the symlinks are (and they obviously terminate the
chain to be revalidated).

Anyway, it's a side issue; we _can_ use the capability to do multi-component
lookups and with link_path_walk()-related logic getting untangled, we might
be able to do just that without messing the code up.  It's very clearly not
4.2 fodder, though, and I'm not sure how high priority it is for 4.3,
simply because making the exclusion between lookups weaker seems to be
of more general interest.  And those two will definitely be stepping on the
same area, so it probably makes more sense to sort the parallel lookups out
first.  _IF_ we manage to get that by 4.3-rc, sure, continuation into
multicomponent lookups would start looking like 4.3 material.  Hell knows -
stranger things have happened...

What we really need is coherent documentation on the whole pathname-related
machinery; I've some preliminary bits and pieces written, but it'll take
more work.  Hopefully I'll have something postable in a week or less...

Right now I can think of 4 or 5 people familiar with the area, myself
included.  And we need more - it shouldn't be fucking black magic, since
fs folks really ought to understand what's going on there.  The thing is,
under the assorted layers of cruft it's simpler than e.g. VMA handling...

Re: [PATCH 2/2] pwm: add Mediatek display PWM driver support

2015-05-17 Thread Daniel Kurtz

On Mon, May 11, 2015 at 5:26 PM, YH Huang  wrote:
> Add display PWM driver support to modify backlight for MT8173/MT6595.
>
> Signed-off-by: YH Huang 
> ---
>  drivers/pwm/Kconfig             |   9 ++
>  drivers/pwm/Makefile            |   1 +
>  drivers/pwm/pwm-disp-mediatek.c | 225 
>  3 files changed, 235 insertions(+)
>  create mode 100644 drivers/pwm/pwm-disp-mediatek.c
>
> diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> index b1541f4..9edbb5a 100644
> --- a/drivers/pwm/Kconfig
> +++ b/drivers/pwm/Kconfig
> @@ -111,6 +111,15 @@ config PWM_CLPS711X
>   To compile this driver as a module, choose M here: the module
>   will be called pwm-clps711x.
>
> +config PWM_DISP_MEDIATEK
> +   tristate "MEDIATEK display PWM driver"
> +   depends on OF
> +   help
> + Generic PWM framework driver for mediatek disp-pwm device.
> +
> + To compile this driver as a module, choose M here: the module
> + will be called pwm-disp-mediatek.
> +
>  config PWM_EP93XX
> tristate "Cirrus Logic EP93xx PWM support"
> depends on ARCH_EP93XX
> diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> index ec50eb5..c5ff72a 100644
> --- a/drivers/pwm/Makefile
> +++ b/drivers/pwm/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_PWM_BCM_KONA)  += pwm-bcm-kona.o
>  obj-$(CONFIG_PWM_BCM2835)  += pwm-bcm2835.o
>  obj-$(CONFIG_PWM_BFIN) += pwm-bfin.o
>  obj-$(CONFIG_PWM_CLPS711X) += pwm-clps711x.o
> +obj-$(CONFIG_PWM_DISP_MEDIATEK)    += pwm-disp-mediatek.o
>  obj-$(CONFIG_PWM_EP93XX)   += pwm-ep93xx.o
>  obj-$(CONFIG_PWM_FSL_FTM)  += pwm-fsl-ftm.o
>  obj-$(CONFIG_PWM_IMG)  += pwm-img.o
> diff --git a/drivers/pwm/pwm-disp-mediatek.c b/drivers/pwm/pwm-disp-mediatek.c
> new file mode 100644
> index 000..38293af
> --- /dev/null
> +++ b/drivers/pwm/pwm-disp-mediatek.c
> @@ -0,0 +1,225 @@
> +/*
> + * Mediatek display pulse-width-modulation controller driver.
> + * Copyright (c) 2015 MediaTek Inc.
> + * Author: YH Huang 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DISP_PWM_EN_OFF        (0x0)
> +#define PWM_ENABLE_SHIFT       (0x0)
> +#define PWM_ENABLE_MASK        (0x1 << PWM_ENABLE_SHIFT)
> +
> +#define DISP_PWM_COMMIT_OFF    (0x08)
> +#define PWM_COMMIT_SHIFT       (0x0)
> +#define PWM_COMMIT_MASK        (0x1 << PWM_COMMIT_SHIFT)
> +
> +#define DISP_PWM_CON_0_OFF     (0x10)
> +#define PWM_CLKDIV_SHIFT       (0x10)
> +#define PWM_CLKDIV_MASK        (0x3ff << PWM_CLKDIV_SHIFT)
> +#define PWM_CLKDIV_MAX         (0x03ff)
> +
> +#define DISP_PWM_CON_1_OFF     (0x14)
> +#define PWM_PERIOD_SHIFT       (0x0)
> +#define PWM_PERIOD_MASK        (0xfff << PWM_PERIOD_SHIFT)
> +#define PWM_PERIOD_MAX         (0x0fff)
> +/* Shift log2(PWM_PERIOD_MAX + 1) as divisor */
> +#define PWM_PERIOD_BIT_SHIFT   12
> +
> +#define PWM_HIGH_WIDTH_SHIFT   (0x10)
> +#define PWM_HIGH_WIDTH_MASK    (0x1fff << PWM_HIGH_WIDTH_SHIFT)
> +
> +#define NUM_PWM 1
> +
> +struct mtk_disp_pwm_chip {
> +   struct pwm_chip chip;
> +   struct device   *dev;
> +   struct clk      *clk_main;
> +   struct clk      *clk_mm;
> +   void __iomem    *mmio_base;
> +};
> +
> +static void mtk_disp_pwm_setting(void __iomem *address, u32 value, u32 mask)
> +{
> +   u32 val;
> +
> +   val = readl(address);
> +   val &= ~mask;
> +   val |= value;
> +   writel(val, address);
> +}
> +
> +static int mtk_disp_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
> +  int duty_ns, int period_ns)
> +{
> +   struct mtk_disp_pwm_chip *mpc;
> +   u64 div, rate;
> +   u32 clk_div, period, high_width, rem;
> +
> +   /*
> +* Find period, high_width and clk_div to suit duty_ns and period_ns.
> +* Calculate proper div value to keep period value in the bound.
> +*
> +* period_ns = 10^9 * (clk_div + 1) * (period +1) / PWM_CLK_RATE
> +* duty_ns = 10^9 * (clk_div + 1) * (high_width + 1) / PWM_CLK_RATE
> +*
> +* period = (PWM_CLK_RATE * period_ns) / (10^9 * (clk_div + 1)) - 1
> +* high_width = (PWM_CLK_RATE * duty_ns) / (10^9 * (clk_div + 1)) - 1
> +*/
> +   mpc = container_of(chip,
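
The formulas in the quoted comment can be worked through in plain C. This is
only a sketch under stated assumptions, not the driver's code: the function
pwm_calc(), the struct pwm_regs and the way the divider is picked (shifting
the cycle count by PWM_PERIOD_BIT_SHIFT) are illustrative choices, and the
26 MHz clock rate in the note below is an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC          1000000000ULL
#define PWM_CLKDIV_MAX        0x3ff
#define PWM_PERIOD_BIT_SHIFT  12	/* log2(PWM_PERIOD_MAX + 1) */

/* Register field values derived from a requested period and duty. */
struct pwm_regs {
	uint32_t clk_div;
	uint32_t period;
	uint32_t high_width;
};

/*
 * Returns 0 and fills *out on success, or -1 if the requested period
 * would need a clock divider larger than PWM_CLKDIV_MAX.
 * The 64-bit products do not overflow for 32-bit nanosecond values and
 * clock rates up to a few GHz.
 */
int pwm_calc(uint64_t rate_hz, uint32_t duty_ns, uint32_t period_ns,
	     struct pwm_regs *out)
{
	uint64_t cycles, div, period, high_width;

	assert(out != NULL);

	/* Number of input clock cycles in one requested period. */
	cycles = rate_hz * period_ns / NSEC_PER_SEC;

	/* Pick a divider that keeps the period field within 12 bits. */
	if ((cycles >> PWM_PERIOD_BIT_SHIFT) > PWM_CLKDIV_MAX)
		return -1;
	out->clk_div = (uint32_t)(cycles >> PWM_PERIOD_BIT_SHIFT);
	div = (uint64_t)out->clk_div + 1;

	/* period + 1 = rate * period_ns / (10^9 * (clk_div + 1)) */
	period = rate_hz * period_ns / (NSEC_PER_SEC * div);
	/* high_width + 1 = rate * duty_ns / (10^9 * (clk_div + 1)) */
	high_width = rate_hz * duty_ns / (NSEC_PER_SEC * div);

	/* The fields hold "value - 1"; clamp at zero instead of wrapping. */
	out->period = (uint32_t)(period > 0 ? period - 1 : 0);
	out->high_width = (uint32_t)(high_width > 0 ? high_width - 1 : 0);
	return 0;
}
```

With an assumed 26 MHz clock, a 1 ms period and a 0.5 ms duty, this gives
clk_div = 6, period = 3713 and high_width = 1856; a 1 s period is rejected
because the divider would exceed PWM_CLKDIV_MAX.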

linux-next: manual merge of the net-next tree with the net tree

2015-05-17 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in
net/switchdev/switchdev.c between commit eea39946a1f3 ("rename
RTNH_F_EXTERNAL to RTNH_F_OFFLOAD") from the net tree and various
commits from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc net/switchdev/switchdev.c
index 055453d48668,0409f9b5bdbc..
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@@ -328,18 -670,13 +670,13 @@@ int switchdev_fib_ipv4_add(u32 dst, in
if (fi->fib_net->ipv4.fib_offload_disabled)
return 0;
  
-   dev = netdev_switch_get_dev_by_nhs(fi);
+   dev = switchdev_get_dev_by_nhs(fi);
if (!dev)
return 0;
-   ops = dev->swdev_ops;
- 
-   if (ops->swdev_fib_ipv4_add) {
-   err = ops->swdev_fib_ipv4_add(dev, htonl(dst), dst_len,
- fi, tos, type, nlflags,
- tb_id);
-   if (!err)
-   fi->fib_flags |= RTNH_F_OFFLOAD;
-   }
+ 
+   err = switchdev_port_obj_add(dev, &fib_obj);
+   if (!err)
 -  fi->fib_flags |= RTNH_F_EXTERNAL;
++  fi->fib_flags |= RTNH_F_OFFLOAD;
  
return err;
  }
@@@ -357,27 -694,34 +694,34 @@@ EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_ad
   *
   *Delete IPv4 route entry from switch device.
   */
- int netdev_switch_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
-  u8 tos, u8 type, u32 tb_id)
+ int switchdev_fib_ipv4_del(u32 dst, int dst_len, struct fib_info *fi,
+  u8 tos, u8 type, u32 tb_id)
  {
+   struct switchdev_obj fib_obj = {
+   .id = SWITCHDEV_OBJ_IPV4_FIB,
+   .u.ipv4_fib = {
+   .dst = dst,
+   .dst_len = dst_len,
+   .fi = fi,
+   .tos = tos,
+   .type = type,
+   .nlflags = 0,
+   .tb_id = tb_id,
+   },
+   };
struct net_device *dev;
-   const struct swdev_ops *ops;
int err = 0;
  
 -  if (!(fi->fib_flags & RTNH_F_EXTERNAL))
 +  if (!(fi->fib_flags & RTNH_F_OFFLOAD))
return 0;
  
-   dev = netdev_switch_get_dev_by_nhs(fi);
+   dev = switchdev_get_dev_by_nhs(fi);
if (!dev)
return 0;
-   ops = dev->swdev_ops;
  
-   if (ops->swdev_fib_ipv4_del) {
-   err = ops->swdev_fib_ipv4_del(dev, htonl(dst), dst_len,
- fi, tos, type, tb_id);
-   if (!err)
-   fi->fib_flags &= ~RTNH_F_OFFLOAD;
-   }
+   err = switchdev_port_obj_del(dev, &fib_obj);
+   if (!err)
 -  fi->fib_flags &= ~RTNH_F_EXTERNAL;
++  fi->fib_flags &= ~RTNH_F_OFFLOAD;
  
return err;
  }



Re: [PATCH net-next,1/1] hv_netvsc: change member name of struct netvsc_stats

2015-05-17 Thread David Miller

From: Simon Xiao 
Date: Fri, 15 May 2015 02:33:03 -0700

> Currently the struct netvsc_stats has a member s_sync
> of type u64_stats_sync.
> This definition will break kernel build as the macro
> netdev_alloc_pcpu_stats requires this member name to be syncp.
> (see netdev_alloc_pcpu_stats definition in ./include/linux/netdevice.h)
> 
> This patch changes netvsc_stats's member name from s_sync to syncp to fix
> the build break.
> 
> Signed-off-by: Simon Xiao 

Applied, thanks.

Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set

2015-05-17 Thread Mike Galbraith

On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
> 
> > Given that kernel initiated association to isolcpus, a user turning
> > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > he/she does not have CPUSETS enabled, or should Rik's patch rendering
> > isolcpus immutable be merged, 
> 
> My patch does not aim to make isolcpus immutable, it aims to make
> isolcpus resistant to system management tools (like libvirt)
> automatically undoing isolcpus the instant a cpuset with the default
> cpus (inherited from the root group) is created.

Aim or not, if cpusets is the sole modifier, it'll render isolcpus
immutable, no?  Cpusets could grow an override to the override I
suppose, to regain control of the resource it thinks it manages.

-Mike


Re: [PATCH] serial: 8250_uniphier: add UniPhier serial driver

2015-05-17 Thread Masahiro Yamada

Hi Matthias,


2015-05-16 18:17 GMT+09:00 Matthias Brugger :

>> +/*
>> + * The register map is slightly different from that of 8250.
>> + * IO callbacks must be overridden for correct access to FCR, LCR, and MCR.
>> + */
>> +static unsigned int uniphier_serial_in(struct uart_port *p, int offset)
>> +{
>> +   int valshift = 0;
>> +
>> +   switch (offset) {
>> +   case UART_LCR:
>> +   valshift = 8;
>
> Please use a define for the value. Something like UNIPHIER_UART_LCR_SHIFT maybe.
>


Will do in v2.
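
For reference, a minimal userspace sketch of what named shifts look like when
two 8-bit registers share one 32-bit word. The LCR shift of 8 is the value
from the quoted code; MCR_SHIFT of 0 and the helpers field_update() and
field_read() are assumptions made for illustration, with a plain uint32_t
standing in for the readl()/writel() access.

```c
#include <assert.h>
#include <stdint.h>

#define LCR_SHIFT  8	/* LCR occupies bits 15:8 of the shared word */
#define MCR_SHIFT  0	/* MCR occupies bits 7:0 (assumed here) */

/*
 * Read-modify-write of one 8-bit field inside a 32-bit register value.
 * The other field sharing the word is preserved.
 */
uint32_t field_update(uint32_t reg, unsigned int shift, uint8_t value)
{
	assert(shift <= 24);
	reg &= ~((uint32_t)0xff << shift);
	reg |= (uint32_t)value << shift;
	return reg;
}

/* Extract one 8-bit field from the shared 32-bit register value. */
uint8_t field_read(uint32_t reg, unsigned int shift)
{
	assert(shift <= 24);
	return (uint8_t)((reg >> shift) & 0xff);
}
```

Writing 0x1b to the LCR field of a word that holds 0x03 in the MCR field
yields 0x00001b03, so the MCR bits are left alone.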

Thank you!


-- 
Best Regards
Masahiro Yamada

Re: [PATCH] serial: 8250_uniphier: add UniPhier serial driver

2015-05-17 Thread Masahiro Yamada

Hi Joachim,




2015-05-16 7:28 GMT+09:00 Joachim  Eastwood :

>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>
> Please put the includes in alphabetic order.

OK.



> Do you really need init.h?

Not really.  Will remove it.


>> +static int uniphier_of_serial_setup(struct platform_device *pdev,
>> +   struct uart_port *port,
>> +   struct uniphier8250_priv *priv)
>> +{
>> +   int ret;
>> +   u32 prop;
>> +   struct device_node *np = pdev->dev.of_node;
>> +
>> +   ret = of_alias_get_id(np, "serial");
>> +   if (ret < 0) {
>> +   dev_err(&pdev->dev, "failed to get alias id\n");
>> +   return ret;
>> +   }
>> +   port->line = priv->line = ret;
>> +
>> +   /* Get clk rate through clk driver */
>> +   priv->clk = of_clk_get(np, 0);
>> +   if (IS_ERR(priv->clk)) {
>> +   dev_err(&pdev->dev, "failed to get clock\n");
>> +   return PTR_ERR(priv->clk);
>> +   }
>
> Use devm_clk_get() if possible.

Sure.



>
>> +static int uniphier_uart_remove(struct platform_device *pdev)
>> +{
>> +   struct uniphier8250_priv *priv = platform_get_drvdata(pdev);
>> +
>> +   serial8250_unregister_port(priv->line);
>> +   if (!IS_ERR_OR_NULL(priv->clk)) {
>> +   clk_disable_unprepare(priv->clk);
>> +   clk_put(priv->clk);
>> +   }
>
> If you use devm_clk_get() in uniphier_of_serial_setup() you only need
> to call clk_disable_unprepare() here.
>
> Calling clk_disable_unprepare() with NULL is allowed and you already
> check for IS_ERR in your setup function.



Yes.

Thank you for your review!



-- 
Best Regards
Masahiro Yamada

Re: [RFC v1 01/11] genirq: Introduce struct irq_common_data to host shared irq data

2015-05-17 Thread Jiang Liu

On 2015/5/8 10:23, Yun Wu (Abel) wrote:
> On 2015/5/4 11:15, Jiang Liu wrote:
> 
> [...]
>> diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
>> index dd1109fb241e..3010e99abf3e 100644
>> --- a/include/linux/irqdesc.h
>> +++ b/include/linux/irqdesc.h
>> @@ -47,6 +47,7 @@ struct pt_regs;
>>   * @name:   flow handler name for /proc/interrupts output
>>   */
>>  struct irq_desc {
>> +struct irq_common_data  irq_common_data;
> 
> Hi Gerry,
> 
> Please update description as well. :)
Thanks, Abel!


Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds

On Sun, May 17, 2015 at 4:16 PM, NeilBrown  wrote:
>
> Just to be crystal clear about what I want:
>   I want the filesystem to be in control

Yeah, no. Not going to happen.

You seem to think that the dcache is "just" a cache. It's not. It's a
cache, but that is absolutely not all that it is. It's very much a
cache with strong semantics.

And no, we're not handing over those semantics over to the filesystem.
The dcache is not just a cache, it's the *primary* data structure that
we use for pathname validation, local security checking, and for doing
things like "getcwd()" and handling ".." etc.

So there's no way the filesystem is "in control". You as a filesystem
are not really even doing the actual pathname lookup. The *only* thing
you're doing is filling in the dcache. The actual real pathname lookup
is done by the VFS layer using the dcache data.

That's how it very fundamentally works.  It's *so* much more than a
cache - it really *is* the primary path lookup. The filesystem is the
slave in this relationship.

> The filesystem then uses generic helpers (or not) to find the answers and adds
> more current information to the cache.

You can do that already. There *are* those generic helpers to add data
to the cache. That's what "d_instantiate()" and friends _are_ for.

But no, you do *not* control name lookup. You get notified when
there's not enough data in the cache, and then you can fill it up any
which way you want.

You can populate the dcache with other entries than the one we asked
for, and you can ask the dcache to revalidate and throw dentries out.

But no, you do *not* get access to things like do_last() or to the
decision to follow symlinks or namespace rules, or mountpoints or
things like that.

> So for Al's example of revalidating multiple components at once, once the VFS
> gets to a point in the path where  d_revalidate says "I need more time",
> the VFS just passes the rest of the path to the filesystem.

That's bullshit, for a very simple and basic reason: "the rest of the
path" is not necessarily at all for your filesystem!

Really. There might be mount-points, there might be symlinks, there
might be tons of stuff like that.

You're not getting control, for the very simple reason that IT IS NOT
YOUR DATA. And it really never ever will be.

Now, this is why I said we can do a "hint" style thing. Part of that
"hint" issue is very very much that it has no semantic meaning. You
can't screw it up, because if it turns out that the path component
we're looking up is a symlink and we actually end up in some other
filesystem, if you end up looking up the hint part, it just would
never actually get used.

So it's kind of like a prefetch for names. It's semantically much
weaker than saying "look up this name". The hint would be "this is
likely the next part of the name that the VFS layer will look up".

And the key part of that statement is
 (a) "likely" (it might not happen, and even if it does happen, it
might not be for your filesystem)
and
 (b) "the VFS layer will look up" because it won't be the low-level
filesystem doing it.

So it would be the low-level filesystem pre-populating the dcache - if
the low-level filesystem decides the hint is worth using for that -
and the VFS layer then uses the data in the dcache without further
bothering the filesystem.

Exactly because the dcache is *so* much more than "just a cache".

Linus

[RFC] arm:consider THUMB and BE endian kernel build

2015-05-17 Thread yalin wang


This patch fixes the kernel_thread() case in copy_thread(): when the
kernel is built as THUMB2 or BE8, we should also set the corresponding
bits in CPSR, so that the kernel thread can return to the correct state
to execute.
---
 arch/arm/kernel/process.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f192a2a..9a7ab32 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -220,6 +220,12 @@ copy_thread(unsigned long clone_flags, unsigned long stack_start,

thread->cpu_context.r4 = stk_sz;
thread->cpu_context.r5 = stack_start;
childregs->ARM_cpsr = SVC_MODE;
+#ifdef CONFIG_THUMB2_KERNEL
+   childregs->ARM_cpsr |= PSR_T_BIT;
+#endif
+#ifdef CONFIG_CPU_ENDIAN_BE8
+   childregs->ARM_cpsr |= PSR_E_BIT;
+#endif
}
thread->cpu_context.pc = (unsigned long)ret_from_fork;
thread->cpu_context.sp = (unsigned long)childregs;
--
1.9.1
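
As a plain-C illustration of the bit arithmetic in the hunk above: the
function name kthread_initial_cpsr() and its boolean parameters are made up
for this sketch and stand in for the two #ifdef blocks; the constants follow
the 32-bit ARM ptrace definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SVC_MODE   0x00000013u	/* supervisor mode */
#define PSR_T_BIT  0x00000020u	/* Thumb state */
#define PSR_E_BIT  0x00000200u	/* big-endian data accesses */

/* The state bits must not collide with the mode field. */
static_assert((SVC_MODE & (PSR_T_BIT | PSR_E_BIT)) == 0,
	      "T/E bits overlap the mode bits");

/*
 * Compose the initial CPSR for a kernel thread. The two flags model
 * CONFIG_THUMB2_KERNEL and CONFIG_CPU_ENDIAN_BE8 respectively.
 */
uint32_t kthread_initial_cpsr(bool thumb2_kernel, bool be8_kernel)
{
	uint32_t cpsr = SVC_MODE;

	if (thumb2_kernel)
		cpsr |= PSR_T_BIT;
	if (be8_kernel)
		cpsr |= PSR_E_BIT;
	return cpsr;
}
```

A Thumb-2, BE8 build would therefore start the thread with 0x233 rather
than the bare 0x13.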

RE: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts

2015-05-17 Thread Wu, Feng



> -----Original Message-----
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, May 15, 2015 9:27 PM
> To: Wu, Feng
> Cc: mi...@redhat.com; h...@zytor.com; linux-kernel@vger.kernel.org;
> jiang@linux.intel.com
> Subject: Re: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts
> 
> On Thu, 30 Apr 2015, Feng Wu wrote:
> >  #ifdef CONFIG_HAVE_KVM
> > +void (*wakeup_handler_callback)(void);
> > +EXPORT_SYMBOL_GPL(wakeup_handler_callback);
> 
> The matching entry in a header file is going to come later again?

I will add the declaration in a header file in this patch.

> 
> >  /*
> >   * Handler for POSTED_INTERRUPT_VECTOR.
> >   */
> > @@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
> >
> > set_irq_regs(old_regs);
> >  }
> > +
> > +/*
> > + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
> > + */
> > +__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
> > +{
> > +   struct pt_regs *old_regs = set_irq_regs(regs);
> > +
> > +   ack_APIC_irq();
> > +
> > +   irq_enter();
> > +
> > +   exit_idle();
> 
>   entering_ack_irq() please

Good idea!

Thanks,
Feng

> 
> > +   inc_irq_stat(kvm_posted_intr_wakeup_ipis);
> > +
> > +   if (wakeup_handler_callback)
> > +   wakeup_handler_callback();
> > +
> > +   irq_exit();
> > +
> > +   set_irq_regs(old_regs);
> > +}
> 
> Thanks,
> 
>   tglx

RE: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

2015-05-17 Thread Wu, Feng

Thanks for the review!

> -----Original Message-----
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, May 15, 2015 9:18 PM
> To: Wu, Feng
> Cc: mi...@redhat.com; h...@zytor.com; linux-kernel@vger.kernel.org;
> jiang@linux.intel.com
> Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
> 
> On Thu, 30 Apr 2015, Feng Wu wrote:
> >
> > Signed-off-by: Jiang Liu 
> 
> So I assume Jiang is the author, right?

Oh, yes, I think I made some mistakes while applying the patches. Thanks
for pointing this out!

> 
> > Signed-off-by: Feng Wu 
> 
> >  /**
> > + * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
> > + * @data:  Pointer to interrupt specific data
> > + * @dest:  The vcpu affinity information
> > + */
> > +int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void
> *vcpu_info)
> > +{
> > +   data = data->parent_data;
> > +   if (data->chip->irq_set_vcpu_affinity)
> > +   return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
> > +
> > +   return -ENOSYS;
> > +}
> 
> That needs a prototype in irq.h, methinks
> 
> > +int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
> > +{
> > +   struct irq_desc *desc = irq_to_desc(irq);
> 
>   irq_get_desc_lock() please
> 
> > +   struct irq_chip *chip;
> > +   unsigned long flags;
> > +   int ret = -ENOSYS;
> > +
> > +   if (!desc)
> > +   return -EINVAL;
> > +
> > +   raw_spin_lock_irqsave(&desc->lock, flags);
> > +   chip = desc->irq_data.chip;
> > +   if (chip && chip->irq_set_vcpu_affinity)
> > +   ret = chip->irq_set_vcpu_affinity(irq_desc_get_irq_data(desc),
> 
> Above you fiddle with desc->irq_data directly. Why using the accessor here?

I will only use one style here.

Thanks,
Feng

> 
> > + vcpu_info);
> > +   raw_spin_unlock_irqrestore(&desc->lock, flags);
> 
> Otherwise this looks good.
> 
> Thanks,
> 
>   tglx

Re: [FYI] tux3: Core changes

2015-05-17 Thread Rik van Riel

On 05/17/2015 09:26 AM, Boaz Harrosh wrote:
> On 05/14/2015 03:59 PM, Rik van Riel wrote:
>> On 05/14/2015 04:26 AM, Daniel Phillips wrote:
>>> Hi Rik,
> <>
>>
>> The issue is that things like ptrace, AIO, infiniband
>> RDMA, and other direct memory access subsystems can take
>> a reference to page A, which Tux3 clones into a new page B
>> when the process writes it.
>>
>> However, while the process now points at page B, ptrace,
>> AIO, infiniband, etc will still be pointing at page A.
>>
> 
> All these problems can also happen with truncate+new-extending-write
> 
> It is the responsibility of the application to take file/range locks
> to prevent these page-pinned problems.

It is unreasonable to expect a process that is being ptraced
(potentially without its knowledge) to take special measures
to protect the ptraced memory from disappearing.

It is impossible for the debugger to take those special measures
for anonymous memory, or unlinked inodes.

I don't think your requirement is workable or reasonable.

-- 
All rights reversed

Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set

2015-05-17 Thread Rik van Riel

On 05/17/2015 01:30 AM, Mike Galbraith wrote:

> Given that kernel initiated association to isolcpus, a user turning
> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> isolcpus immutable be merged, 

My patch does not aim to make isolcpus immutable, it aims to make
isolcpus resistant to system management tools (like libvirt)
automatically undoing isolcpus the instant a cpuset with the default
cpus (inherited from the root group) is created.

-- 
All rights reversed

Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

2015-05-17 Thread NeilBrown

On Sat, 16 May 2015 19:42:54 +0200 Patrick Marlier
 wrote:

> 
> 
> On 05/13/2015 04:58 AM, NeilBrown wrote:
> > On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt  
> > wrote:
> >
> >> On Tue, 12 May 2015 15:46:26 -0700
> >> "Paul E. McKenney"  wrote:
> >>
> >>> From: Patrick Marlier 
> >>>
> >>> Signed-off-by: Patrick Marlier 
> >>> Signed-off-by: Paul E. McKenney 
> >>> ---
> >>>   drivers/md/bitmap.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> >>> index 2bc56e2a3526..32901772e4ee 100644
> >>> --- a/drivers/md/bitmap.c
> >>> +++ b/drivers/md/bitmap.c
> >>> @@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct 
> >>> md_rdev *rdev, struct mddev *mdde
> >>>   rcu_read_lock();
> >>>   if (rdev == NULL)
> >>>   /* start at the beginning */
> >>> - rdev = list_entry_rcu(&mddev->disks, struct md_rdev, same_set);
> >>> + rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, 
> >>> same_set);
> >>
> >> Hmm, this changes the semantics.
> >>
> >> The original code looks nasty, I first thought it was broken, but it
> >> seems to work out of sheer luck (or clever hack)
> >
> > Definitely a clever hack - no question of "luck" here :-)
> >
> > It might make sense to change it to use list_for_each_entry_from_rcu()
> >
> >if (rdev == NULL)
> >   rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
> >else {
> >   rdev_dec_pending(rdev, mddev);
> >   rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, 
> > same_set);
> >}
> >list_for_each_entry_from_rcu(rdev, )
> >
> > but there isn't a "list_next_entry_rcu"
> >
> >
> > Also, it would have been polite to at least 'cc' the maintainer of this code
> > in the original patch - no?
> 
> Sure my bad. I hesitated to CC maintainers. I was almost sure that it 
> will be rejected so I wanted to avoid noise.

Well... If the subject had contained the magic string "RFC" I might have been
less concerned.
But there have been enough times that people have changed md without telling
me, and thereby broken it, that I'd much rather see the patch than not.


> 
> 
> >
> > Thanks,
> > NeilBrown
> >
> >>
> >>>   else {
> >>>   /* release the previous rdev and start from there. */
> >>>   rdev_dec_pending(rdev, mddev);
> >>
> >>
> >> What comes after this is:
> >>
> >>list_for_each_entry_continue_rcu(rdev, &mddev->disks, same_set) {
> >>if (rdev->raid_disk >= 0 &&
> >>
> >> Now the original code had:
> >>
> >>rdev = list_entry_rcu(&mddev->disks, struct md_rdev, same_set);
> >>
> >> Where &mddev->disks would return the address of the disks field of
> >> mddev which is a list head. Then it would get the 'same_set' offset,
> >> which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
> >> isn't used, as the list_for_each_entry_continue_rcu() has:
> >>
> >> #define list_for_each_entry_continue_rcu(pos, head, member) \
> >>for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
> >> &pos->member != (head);\
> >> pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> >>
> >> Thus the first use of pos is pos->member.next or:
> >>
> >>mddev->disks.next
> >>
> >> But now you converted it to rdev = mddev->disks.next, which means the
> >> first use is:
> >>
> >>pos = mddev->disks.next->next
> >>
> >> I think you are skipping the first element here.
> 
> 
> struct mddev {
> ...
>   struct list_headdisks;
> ...}
> 
> struct list_head {
>  struct list_head *next, *prev;
> };
> 
> The tricky thing is that "list_entry_rcu" before and after the patch is 
> reading the same thing.

No it isn't.
Before the patch it is passed the address of the 'next' field.  After the
patch it is passed the contents of the 'next' field.
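
The difference can be shown with a minimal userspace sketch. This is not the
md code: "struct item", first_visited() and demo() are made-up names, plain
loads stand in for the RCU accessors, and the list member is assumed to sit
at offset 0 as same_set does here. The kernel wraps the starting pointer in
list_entry_rcu() first; at offset 0 that does not change the address, so the
sketch skips that step.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for md_rdev: the list member comes first (offset 0). */
struct item {
	struct list_head same_set;
	int id;
};

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* One "continue"-style step: advance from start, then visit the entry there. */
static int first_visited(struct list_head *start)
{
	return list_entry(start->next, struct item, same_set)->id;
}

/* Build the list 1 -> 2 -> 3 and record the first id each variant visits. */
static void demo(int *before_patch, int *after_patch)
{
	struct list_head disks = { &disks, &disks };
	struct item a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

	list_add_tail(&a.same_set, &disks);
	list_add_tail(&b.same_set, &disks);
	list_add_tail(&c.same_set, &disks);
	assert(disks.next == &a.same_set);

	/* Before the patch: start from the address of the list head. */
	*before_patch = first_visited(&disks);
	/* After the patch: start from the contents of the head's 'next' field. */
	*after_patch = first_visited(disks.next);
}
```

Under these assumptions the first variant begins at the first entry and the
second begins one entry later, which matches the "skipping the first element"
concern quoted above.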


> 
> However in your case, the change I proposed is probably wrong I trust 
> you on this side. :) What's your proposal to fix it with the rculist patch?

What needs fixing?  I don't see anything broken.

Maybe there is something in this "rculist patch" that I'm missing.  Can you
point me at it?

Thanks,
NeilBrown


> 
> PS: In the rculist patch I proposed, I avoid the store and the atomic 
> reload in the stack variable __ptr. (yeap, the 
> rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly 
> do & on the parameter).
> 
> Thanks.
> --
> Pat

Re: [PATCH 0/7 V2] workqueue: cleanup for attr management

2015-05-17 Thread Lai Jiangshan

On 05/18/2015 09:26 AM, Tejun Heo wrote:
> On Mon, May 18, 2015 at 08:39:21AM +0800, Lai Jiangshan wrote:
>> ping
> 
> Does this reflect the comments from the previous review cycle?
> 

This is V2 of the V1 patchset, but it only contains updated versions of
patches 1 and 2 from V1.

It doesn't contain the fix-up patch for wq_[nice|cpumask|numa]_store(),
so I can say it reflects all the comments except the one about the name of the
function "get_node_unbound_pwq()" (the patch was sent before you replied).
(I hope to get more comments before the next version.)

The fix-up patch for wq_[nice|cpumask|numa]_store() is important, so
should I send a patchset for it directly (including patches 1 and 2 of this V2
patchset), and delay or even drop "get_alloc_node_unbound_pwq()"?

Thanks,
Lai.


Re: [PATCH 1/1] suspend: delete sys_sync()

2015-05-17 Thread NeilBrown

On Fri, 15 May 2015 11:35:57 +0100 One Thousand Gnomes
 wrote:

> > > Data loss may be caused for hotplug storage(like USB), or all storage
> > > when power is exhausted during suspend.
> > 
> > Which also may very well happen at run time, right?
> 
> Intuitively users treat "suspended" as a bit like off and do remove
> devices they've "finished with".
> 
> Technically yes the instant off/on on a phone is the same as the suspend
> to RAM on a laptop but it doesn't mean the model in people's heads is the
> same... not yet anyway.
> 
> > > Is there obvious advantage to remove sys_sync() in the case?
> > 
> > Yes, there is.  It is not necessary to sync() every time you suspend
> > if you do that very often.
> 
> But if you do it very often you won't have any dirty pages to flush so it
> will be very fast.

Several people have said things like this, and yet Len clearly saw a problem
so sometimes it isn't as fast as you think it should be.  I guess we need
some data.

In fact there seem to be a number of open questions:

1/ Len has seen enough delays to bother sending a patch.  Twice.  Lots of
   other people claim there shouldn't be much delay.

   Len: can you provide numbers?  How long does the sys_sync take
   (min/max/mean).  I think an interesting number would be in a quick
   "resume, do something, suspend" cycle, what percent of the time is spent
   in sys_sync.
   Maybe also report what filesystems are in use, and whether you expect
   there to be any dirty data.

   If I run "time sync; time sync; time sync" it reports about 50msec for
   each sync.  If I run "time sleep 0; time sleep 0; time sleep 0" it reports
   about 2 msec.  So sys_sync is taking at least 50msec.
   Almost all of that is in sync_inodes_sb() and much of that is in
   _raw_spin_lock though I might be misinterpreting perf.  It seems to
   wait for a BDI flusher thread to go off and do nothing.

   Len: is 50msec enough to bother you?

2/ Is a 'sys_sync' really necessary?  People claim that "users might lose
   data".  I claim that some user data could be still in the application
   buffers so they would lose that anyway, and applications call fsync after
   writing everything out, so sys_sync isn't necessary.
   Does anyone have an example of an application which writes important
   user data, but doesn't call "fsync" shortly after all data has been written
   out of the application's internal buffers?

3/ Is a 'sys_sync' sufficient?  Len claims that in some cases, when running
 sync; sync
   the second sync takes longer, suggesting that the first sync didn't do
   everything that was expected.   Prior to Linux 1.3.20, sys_sync didn't
   wait for everything.  Since then it seems to be designed to, but maybe
   something isn't right there.
   In that case, having sys_sync may not be helping anyway.

   Len: can you reliably reproduce this?  Can you provide a recipe?

4/ Which part of 'sync' is really important?
   sys_sync does two things.  It initiates writeout on dirty data and it
   waits for writeout to complete.  Even if the former isn't necessary, the
   latter probably is.  Do we need to reliably wait for all output queues
   to flush?  Where is the best place to do that?
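
For the numbers asked for in 1/, a rough userspace sketch along these lines
could collect min/max/mean for back-to-back sync() calls. The struct and
helper names are made up for illustration, and it measures only the syscall
as seen from userspace, not the suspend path itself.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for sync() and clock_gettime() declarations */
#endif
#include <assert.h>
#include <time.h>
#include <unistd.h>

struct call_stats { double min_ms, max_ms, mean_ms; };

/* Milliseconds spent in fn(), measured on the monotonic clock. */
static double time_call_ms(void (*fn)(void))
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	fn();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Call fn() 'runs' times in a row and summarise the timings. */
static struct call_stats measure(void (*fn)(void), int runs)
{
	struct call_stats s = { 0.0, 0.0, 0.0 };
	double total = 0.0;

	assert(runs > 0);
	for (int i = 0; i < runs; i++) {
		double ms = time_call_ms(fn);

		if (i == 0 || ms < s.min_ms)
			s.min_ms = ms;
		if (i == 0 || ms > s.max_ms)
			s.max_ms = ms;
		total += ms;
	}
	s.mean_ms = total / runs;
	return s;
}

/* No-op baseline, to estimate the overhead of the measurement itself. */
static void do_nothing(void)
{
}

/* Hypothetical helper: time 'runs' back-to-back sync() calls. */
static struct call_stats measure_sync(int runs)
{
	return measure(sync, runs);
}
```

Comparing measure_sync(3) against measure(do_nothing, 3) on an otherwise idle
system would give figures comparable to the "time sync" runs above, without
the process start-up cost that "time" includes.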

Thanks,
NeilBrown

> 
> > And it is done in such a place that everything needs to wait for it to 
> > complete.
> 
> Only because the code deciding to trigger any automated suspend doesn't
> do a sync a few seconds before. In the case the user goes to the menus
> and does power->suspend then yes it's a delay. In the case where the OS
> at some level has decided that it's 10 seconds from automatically
> suspending to something the user space can issue a pre-emptive sync to
> get the queue size down.
> 
> Alan


Re: [PATCH v1 1/1] ARM: EXYNOS: fix DEBUG_LL on Cortex-A7.

2015-05-17 Thread Chanho Park

Hi Tarek,

On Sun, May 17, 2015 at 3:42 PM, Tarek Dakhran  wrote:
> Cortex-A7 has EXYNOS5_PA_UART base address for UART.
> If the system boots from a Cortex-A7 CPU, addruart loads the wrong
> address. This patch fixes this.
>
> Signed-off-by: Tarek Dakhran 
> ---
>  arch/arm/include/debug/exynos.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/include/debug/exynos.S b/arch/arm/include/debug/exynos.S
> index b17fdb7..a61b3ea 100644
> --- a/arch/arm/include/debug/exynos.S
> +++ b/arch/arm/include/debug/exynos.S
> @@ -24,6 +24,7 @@
> mrc p15, 0, \tmp, c0, c0, 0
> and \tmp, \tmp, #0xf0
> teq \tmp, #0xf0 @@ A15
> +   teqne   \tmp, #0x70 @@ A7
> ldreq   \rp, =EXYNOS5_PA_UART
> movne   \rp, #EXYNOS4_PA_UART   @@ EXYNOS4
> ldr \rv, =S3C_VA_UART
> --
> 1.9.1
>

Please see Joonyoung's previous patch. Your patch could break
exynos3250 because it not only has two Cortex-A7 CPUs but also uses
EXYNOS4_PA_UART.
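
The selection logic can be modelled in a small C sketch, given here as an
illustration only. The function and enum names are made up, and the MIDR
constants are the ARM-documented part numbers for each core with the revision
bits zeroed. It shows that a check on the CPU part number alone cannot tell
an Exynos5 Cortex-A7 apart from the Cortex-A7 in exynos3250.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative MIDR values (revision bits zeroed). */
#define MIDR_CORTEX_A7	0x410fc070u
#define MIDR_CORTEX_A9	0x410fc090u
#define MIDR_CORTEX_A15	0x410fc0f0u

enum uart_base { UART_EXYNOS4, UART_EXYNOS5 };

/* Same field the assembly isolates with "and \tmp, \tmp, #0xf0". */
static uint32_t midr_field(uint32_t midr)
{
	return midr & 0xf0;
}

/* Current code: only Cortex-A15 (0xf0) selects the Exynos5 UART. */
static enum uart_base select_current(uint32_t midr)
{
	return midr_field(midr) == 0xf0 ? UART_EXYNOS5 : UART_EXYNOS4;
}

/* With the added "teqne \tmp, #0x70": Cortex-A7 selects it as well. */
static enum uart_base select_patched(uint32_t midr)
{
	uint32_t field = midr_field(midr);

	return (field == 0xf0 || field == 0x70) ? UART_EXYNOS5 : UART_EXYNOS4;
}
```

With the patched check every Cortex-A7 gets the Exynos5 address, so a SoC
that pairs Cortex-A7 with EXYNOS4_PA_UART would be pointed at the wrong UART.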

http://www.spinics.net/lists/linux-samsung-soc/msg37318.html

-- 
Best Regards,
Chanho Park

Re: suspend regression in 4.1-rc1

2015-05-17 Thread Sergey Senozhatsky

On (05/17/15 20:50), Michal Hocko wrote:
> Hi,
> s2ram broke after 4.1-rc1 for me. The second s2ram simply doesn't wake
> up (fans turn on but the screen is off). I have even noticed fans
> starting also while suspended in some instances (which was especially
> annoying when it happened on the way home from work).
> I've tried /sys/power/pm_test and the issue starts at processors mode.
> Nothing really interesting shows up in the netconsole but I didn't get
> to a more detailed testing there.
> 
> I've tried to bisect this as 4.0 works reliably. This was tricky though
> because the first bad commit is a merge:
> 

Hello,

JFI, I see quite similar behaviour on my laptop (linux-next doesn't boot:
blank screen and fans on). I've tried to bisect yesterday, but it didn't
go well, showing that the first bad commit is '31ccd0e66d41' (nonsense);
it seems that the root cause is somewhere between next-20150505 (ok) and
next-20150506 (boot failure). I'll continue bisecting.

-ss

> commit 1dcf58d6e6e6eb7ec10e9abc56887b040205b06f
> Merge: 80dcc31fbe55 e4b0db72be24
> Author: Linus Torvalds 
> Date:   Tue Apr 14 16:49:17 2015 -0700
> 
> Merge branch 'akpm' (patches from Andrew)
> 
> The merge commit is empty and both 80dcc31fbe55 and e4b0db72be24 work
> properly but the merge is bad. So it seems like one of the commits in
> either branch has a side effect which needs the other branch in order to
> reproduce.
> 
> So I've tried to bisect ^80dcc31fbe55 e4b0db72be24 and merged 80dcc31fbe55
> in each step. This led to:
> 
> commit 195daf665a6299de98a4da3843fed2dd9de19d3a
> Author: Ulrich Obergfell 
> Date:   Tue Apr 14 15:44:13 2015 -0700
> 
> watchdog: enable the new user interface of the watchdog mechanism
> 
> The patch doesn't revert because of follow up changes so I have reverted
> all three:
> 692297d8f968 ("watchdog: introduce the hardlockup_detector_disable() 
> function")
> b2f57c3a0df9 ("watchdog: clean up some function names and arguments")
> 195daf665a62 ("watchdog: enable the new user interface of the watchdog 
> mechanism")
> 
> on top of my current Linus tree (4cfceaf0c087f47033f5e61a801f4136d6fb68c6)
> and the issue is gone. I have a hard time understanding what these 3 could have
> to do with the suspend path, though.
> 
> Then I've tried to bisect the other branch and merge 195daf665a62 during
> each step to find out which patch starts failing. This led to an even
> weirder commit a1e12da4796a ("perf tools: Add 'I' event modifier for
> exclude_idle bit") but maybe I've just screwed something up on the way.
> 
> I will continue debugging tomorrow but any hints would be helpful.
> -- 
> Michal Hocko
> SUSE Labs

Re: [PATCH] perf tools: Fix "Command" sort_entry's cmp and collapse function

2015-05-17 Thread Namhyung Kim

Hi Jiri,

CC-ing Frederic as he wrote the comm change.

On Fri, May 15, 2015 at 05:54:28PM +0200, Jiri Olsa wrote:
> Currently the se_cmp and se_collapse use pointer comparison,
> which is ok for testing equality of strings. It's not ok
> as a comparison function for rbtree insertion, because it gives
> different results based on current pointer values.
> 
> We saw test 32 (hists cumulation test) failing based on different
> environment setup. Having all sort functions straightened fixes the
> test for us.

Can you elaborate?

AFAIK the comm string is shared among threads, so pointer comparison and
'strcmp == 0' should give the same result.
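
A minimal userspace sketch of the distinction in question follows. It is not
perf code and the helper names are made up. With shared strings, pointer
equality and 'strcmp == 0' do agree, but the sign of a pointer difference
depends on where the two strings happen to live, and the sign is what an
rbtree insertion uses for ordering.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Address-based comparison, like the pre-patch code: the sign follows addresses. */
static int64_t cmp_by_pointer(const char *left, const char *right)
{
	return (int64_t)((intptr_t)right - (intptr_t)left);
}

/* Content-based comparison, like the patch. */
static int64_t cmp_by_content(const char *left, const char *right)
{
	return strcmp(right, left);
}

static int sign(int64_t v)
{
	return (v > 0) - (v < 0);
}

/*
 * Compare "bash" (left) with "perf" (right) twice, swapping which slot of
 * the pool each string lives in. Returns 1 if the resulting order changed.
 */
static int order_depends_on_layout(int64_t (*cmp)(const char *, const char *))
{
	char pool[2][16];
	int first, second;

	strcpy(pool[0], "bash");
	strcpy(pool[1], "perf");
	first = sign(cmp(pool[0], pool[1]));

	strcpy(pool[0], "perf");
	strcpy(pool[1], "bash");
	second = sign(cmp(pool[1], pool[0]));

	return first != second;
}
```

Here order_depends_on_layout(cmp_by_pointer) reports a change while
order_depends_on_layout(cmp_by_content) does not, which would be consistent
with a test that passes or fails depending on the environment.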

Thanks,
Namhyung


> 
> Reported-by: Jan Stancek 
> Link: http://lkml.kernel.org/n/tip-tklp6y27bseqjibcwn0py...@git.kernel.org
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/util/sort.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 4593f36ecc4c..09d4696fd9a1 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -89,14 +89,14 @@ static int64_t
>  sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
>  {
>   /* Compare the addr that should be unique among comm */
> - return comm__str(right->comm) - comm__str(left->comm);
> + return strcmp(comm__str(right->comm), comm__str(left->comm));
>  }
>  
>  static int64_t
>  sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
>  {
>   /* Compare the addr that should be unique among comm */
> - return comm__str(right->comm) - comm__str(left->comm);
> + return strcmp(comm__str(right->comm), comm__str(left->comm));
>  }
>  
>  static int64_t
> -- 
> 1.9.3
> 

[PATCH v2 4/4] ARM: multi_v7_defconfig: Enable OHCI on Exynos

2015-05-17 Thread Krzysztof Kozlowski

From: Krzysztof Kozlowski 

Enable the USB OHCI driver for Exynos SoCs: S5PV210, Exynos4 family,
Exynos5250, Exynos5440 and Exynos542x/5800 family.

Signed-off-by: Krzysztof Kozlowski 
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 8fa7171b12f3..fe88429b9279 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -455,6 +455,7 @@ CONFIG_USB_ISP1760_HCD=y
 CONFIG_USB_OHCI_HCD=y
 CONFIG_USB_OHCI_HCD_STI=y
 CONFIG_USB_OHCI_HCD_PLATFORM=y
+CONFIG_USB_OHCI_EXYNOS=m
 CONFIG_USB_R8A66597_HCD=m
 CONFIG_USB_RENESAS_USBHS=m
 CONFIG_USB_STORAGE=y
-- 
1.9.1


[PATCH v2 1/4] ARM: multi_v7_defconfig: Enable CPU idle for Exynos

2015-05-17 Thread Krzysztof Kozlowski

Current Exynos CPU idle driver supports entering AFTR (Arm Off, Top
Running) mode on Exynos3250, Exynos4210 (coupled), Exynos4x12 and
Exynos5250. Enable it in default configuration to reduce energy
consumption.

Signed-off-by: Krzysztof Kozlowski 
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 76cbb81cbaa3..d524d2e9633c 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -117,6 +117,7 @@ CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_STAT_DETAILS=y
 CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
 CONFIG_CPU_IDLE=y
+CONFIG_ARM_EXYNOS_CPUIDLE=y
 CONFIG_NEON=y
 CONFIG_KERNEL_MODE_NEON=y
 CONFIG_ARM_ZYNQ_CPUIDLE=y
-- 
1.9.1


[PATCH v2 2/4] ARM: multi_v7_defconfig: Enable PMIC and MUIC drivers for Exynos boards

2015-05-17 Thread Krzysztof Kozlowski

Enable drivers for PMICs and MUICs present on Exynos-based devices:
 - max14577: charger, fuel gauge (max17040), regulator,
   used on: Gear 1, Gear 2,
 - max77693: charger, fuel gauge (max17042),
   used on: Trats2,
 - s5m/s2mps: RTC, clock,
   used on: Arndale, Arndale Octa, Gear 1, Gear 2

This allows full usage of charging stack on these devices along with RTC
and 32 kHz clocks.

Signed-off-by: Krzysztof Kozlowski 
---
 arch/arm/configs/multi_v7_defconfig | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index d524d2e9633c..f09ae2089284 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -328,6 +328,10 @@ CONFIG_GPIO_SYSCON=y
 CONFIG_GPIO_TPS6586X=y
 CONFIG_GPIO_TPS65910=y
 CONFIG_BATTERY_SBS=y
+CONFIG_CHARGER_MAX14577=m
+CONFIG_BATTERY_MAX17040=m
+CONFIG_BATTERY_MAX17042=m
+CONFIG_CHARGER_MAX77693=m
 CONFIG_CHARGER_TPS65090=y
 CONFIG_POWER_RESET_AS3722=y
 CONFIG_POWER_RESET_GPIO=y
@@ -357,8 +361,10 @@ CONFIG_MFD_AXP20X=y
 CONFIG_MFD_CROS_EC=y
 CONFIG_MFD_CROS_EC_I2C=m
 CONFIG_MFD_CROS_EC_SPI=y
+CONFIG_MFD_MAX14577=y
 CONFIG_MFD_MAX77686=y
 CONFIG_MFD_MAX8907=y
+CONFIG_MFD_MAX77693=y
 CONFIG_MFD_SEC_CORE=y
 CONFIG_MFD_STMPE=y
 CONFIG_MFD_PALMAS=y
@@ -374,9 +380,11 @@ CONFIG_REGULATOR_DA9210=y
 CONFIG_REGULATOR_GPIO=y
 CONFIG_MFD_SYSCON=y
 CONFIG_POWER_RESET_SYSCON=y
+CONFIG_REGULATOR_MAX14577=m
 CONFIG_REGULATOR_MAX8907=y
 CONFIG_REGULATOR_MAX8973=y
 CONFIG_REGULATOR_MAX77686=y
+CONFIG_REGULATOR_MAX77693=m
 CONFIG_REGULATOR_MAX77802=m
 CONFIG_REGULATOR_PALMAS=y
 CONFIG_REGULATOR_S2MPS11=y
@@ -529,6 +537,7 @@ CONFIG_RTC_DRV_SUN6I=y
 CONFIG_RTC_DRV_SUNXI=y
 CONFIG_RTC_DRV_MV=y
 CONFIG_RTC_DRV_TEGRA=y
+CONFIG_RTC_DRV_S5M=m
 CONFIG_DMADEVICES=y
 CONFIG_DW_DMAC=y
 CONFIG_MV_XOR=y
@@ -558,6 +567,7 @@ CONFIG_QCOM_GSBI=y
 CONFIG_COMMON_CLK_QCOM=y
 CONFIG_COMMON_CLK_MAX77686=y
 CONFIG_COMMON_CLK_MAX77802=m
+CONFIG_COMMON_CLK_S2MPS11=m
 CONFIG_APQ_MMCC_8084=y
 CONFIG_MSM_GCC_8660=y
 CONFIG_MSM_MMCC_8960=y
-- 
1.9.1


[PATCH v2 3/4] ARM: multi_v7_defconfig: Enable TMU for Exynos

2015-05-17 Thread Krzysztof Kozlowski

From: Krzysztof Kozlowski 

Enable support for Thermal Monitoring Unit present on Exynos SoCs. This
allows detection of overheat and handling this gracefully.

Signed-off-by: Krzysztof Kozlowski 
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index f09ae2089284..8fa7171b12f3 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -342,6 +342,7 @@ CONFIG_SENSORS_LM90=y
 CONFIG_SENSORS_LM95245=y
 CONFIG_THERMAL=y
 CONFIG_CPU_THERMAL=y
+CONFIG_EXYNOS_THERMAL=m
 CONFIG_RCAR_THERMAL=y
 CONFIG_ARMADA_THERMAL=y
 CONFIG_DAVINCI_WATCHDOG
-- 
1.9.1


[PATCH v2 0/4] ARM: multi_v7_defconfig: Stuff for Exynos

2015-05-17 Thread Krzysztof Kozlowski

Dear Kukjin,


Changes since v1:
=================
1. Select drivers as modules, whenever possible. Suggested by Javier.
2. Patch 2: The I2C_GPIO is already enabled as module.


Description
===========
The patchset enables various config options on multi_v7 config
for Exynos boards.

Arnd suggested [0] that this can go through your tree.

Patchset is rebased on next-20150515 and Javier's patchset [1]
(to avoid conflicts around regulators and clocks).

Please let me know if this should be rebased on other commit.

[0] http://www.spinics.net/lists/kernel/msg1991518.html
[1] http://www.spinics.net/lists/kernel/msg1990767.html


Best regards,
Krzysztof

Krzysztof Kozlowski (4):
  ARM: multi_v7_defconfig: Enable CPU idle for Exynos
  ARM: multi_v7_defconfig: Enable PMIC and MUIC drivers for Exynos
    boards
  ARM: multi_v7_defconfig: Enable TMU for Exynos
  ARM: multi_v7_defconfig: Enable OHCI on Exynos

 arch/arm/configs/multi_v7_defconfig | 13 +
 1 file changed, 13 insertions(+)

-- 
1.9.1


Re: [PATCH v2] Documentation/arch: Add kernel feature descriptions and arch support status under Documentation/features/

2015-05-17 Thread Michael Ellerman

On Fri, 2015-05-15 at 09:49 +0200, Ingo Molnar wrote:
> * Michael Ellerman  wrote:
> 
> > On Thu, 2015-05-14 at 12:38 -0700, Andrew Morton wrote:
> > > > Add arch support matrices for more than 40 generic kernel features
> > > > that need per architecture support.
> > > > 
> > > > Each feature has its own directory under 
> > > > Documentation/features/feature_name/,
> > > > and the arch-support.txt file shows its current arch porting status.
> > > 
> > > It would be nice to provide people with commit IDs to look at, but the
> > > IDs won't be known at the time the documentation file is created.  We
> > > could provide patch titles.
> > 
> > +1 on patch titles.
> 
> Ok, I'll solve this.

Thanks.

> > > But still, let's not overdo it - get something in there, see how 
> > > well it works, evolve it over time.
> > > 
> > > I don't think we've heard from any (non-x86) arch maintainers?  Do 
> > > they consider this useful at all?  Poke.
> > 
> > Yes it is. I have my own version I've cobbled together for powerpc, 
> > but this is much better.
> 
> Please double check the PowerPC support matrix for correctness (if you 
> haven't yet):

It looks good except for:

>  rwsem-optimized:  |  ok  |  Optimized asm/rwsem.h  #  arch provides optimized rwsem APIs

I don't see an rwsem.h in powerpc anywhere?


And this is correct but a bit confusing:

>  irq-time-acct:  |  ok  |  HAVE_IRQ_TIME_ACCOUNTING  #  arch supports precise IRQ time accounting

I think you and Paul agreed it's "ok" on powerpc because we have
VIRT_CPU_ACCOUNTING instead, but that's not obvious.

> > I'd like to see more description in the individual files of what the
> > feature is, and preferably some pointers to what's needed to
> > implement it.
>
> Yeah, so I tried to add a short description to the feature file 
> itself, and for many of these features that single sentence is the 
> only documentation we have in the kernel source ...

Yep, so that's better than what we had, and we can always improve it.

cheers



Re: [Regression] Guest fs corruption with 'block: loop: improve performance via blk-mq'

2015-05-17 Thread Ming Lei

Hi Santosh,

Thanks for your report!

On Sun, May 17, 2015 at 4:13 AM, santosh shilimkar
 wrote:
> Hi Ming Lei, Jens,
>
> While doing few tests with recent kernels with Xen Server,
> we saw guests(DOMU) disk image getting corrupted while booting it.
> Strangely the issue is seen so far only with disk image over ocfs2
> volume. If the same image kept on the EXT3/4 drive, no corruption
> is observed. The issue is easily reproducible. You see the flurry
> of errors while guest is mounting the file systems.
>
> After some debugging and bisecting, we narrowed the issue down to
> commit "b5dd2f6 block: loop: improve performance via blk-mq". With
> that commit reverted the corruption goes away.
>
> Some more details on the test setup:
> 1. OVM(XEN) Server kernel(DOM0) upgraded to more recent kernel
> which includes commit b5dd2f6. Boot the Server.
> 2. On DOM0 file system create a ocfs2 volume
> 3. Keep the Guest(VM) disk image on ocfs2 volume.
> 4. Boot guest image. (xm create vm.cfg)

I am not familiar with Xen, so is the image accessed via a
loop block device inside the guest VM? Is the loop block device created
in DOM0 or in the guest VM?

> 5. Observe the VM boot console log. VM itself use the EXT3 fs.
> You will see errors like below and after this boot, that file
> system/disk-image gets corrupted and mostly won't boot next time.

OK, that means the image is corrupted by VM booting.

>
> Trimmed Guest kernel boot log...
> --->
> EXT3-fs (dm-0): using internal journal
> EXT3-fs: barriers not enabled
> kjournald starting.  Commit interval 5 seconds
> EXT3-fs (xvda1): using internal journal
> EXT3-fs (xvda1): mounted filesystem with ordered data mode
> Adding 1048572k swap on /dev/VolGroup00/LogVol01.  Priority:-1 extents:1
> across:1048572k
>
> [...]
>
> EXT3-fs error (device dm-0): ext3_xattr_block_get: inode 804966: bad block
> 843250
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620385
> JBD: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk
> of filesystem corruption in case of system crash.
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620392
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620394
>
> [...]
>
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620385
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620392
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620394
>
> [...]
>
> EXT3-fs error (device dm-0): ext3_add_entry: bad entry in directory #777661:
> rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
>
> [...]
>
> automount[2605]: segfault at 4 ip b7756dd6 sp b6ba8ab0 error 4 in
> ld-2.5.so[b774c000+1b000]
> EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap -
> block_group = 34, block = 1114112
> EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap -
> block_group = 0, block = 221
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 589841
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 589841
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 589841
> EXT3-fs error (device dm-0): ext3_xattr_block_get: inode 709252: bad block
> 370280
> ntpd[2691]: segfault at 2563352a ip b77e5000 sp bfe27cec error 6 in
> ntpd[b777d000+74000]
> EXT3-fs error (device dm-0): htree_dirblock_to_tree: bad entry in directory
> #618360: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
> name_len=0
> EXT3-fs error (device dm-0): ext3_add_entry: bad entry in directory #709178:
> rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
> EXT3-fs error (device dm-0): ext3_xattr_block_get: inode 368277: bad block
> 372184
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620392
> EXT3-fs error (device dm-0): ext3_lookup: deleted inode referenced: 620393
> 
>
> From the debug of the actual data on the disk vs what is read by
> the guest VM, we suspect the *reads* are actually not going all
> the way to disk and possibly returning the wrong data. Because
> the actual data on ocfs2 volume at those locations seems
> to be non-zero, whereas the guest seems to read it as zero.

Two big changes in the patchset are: 1) use blk-mq request based IO;
2) submit I/O concurrently (write vs. write is still serialized).

Could you apply the patch in below link to see if it can fix the issue?
BTW, this patch only removes concurrent submission.

http://marc.info/?t=14309322324=1=2

>
> I tried a few experiments without much success so far. One of the
> things I suspected was that requests are now submitted to the backend
> file/device concurrently, so I tried to move them under lo->lo_lock
> so that they get serialized. I also moved the blk_mq_start_request()
> call inside the actual work, as in the patch below. But it didn't
> help. I thought of reporting the issue to get more ideas on what
> could be going wrong. Thanks for help in

Re: [PATCH 0/7 V2] workqueue: cleanup for attr management

2015-05-17 Thread Tejun Heo

On Mon, May 18, 2015 at 08:39:21AM +0800, Lai Jiangshan wrote:
> ping

Does this reflect the comments from the previous review cycle?

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v10 4/4] cgroups: implement the PIDs subsystem

2015-05-17 Thread Tejun Heo

On Sat, May 16, 2015 at 01:59:09PM +1000, Aleksa Sarai wrote:
> One question RE: defaults for .config. What is the kernel policy for
> deciding if a particular subsystem should be made enabled-by-default?

Just default to N.

Thanks.

-- 
tejun

Re: [PATCH Not-for-merge 40/40] perf tools: Disable thread refcount due to bug

2015-05-17 Thread Arnaldo Carvalho de Melo

Em Mon, May 18, 2015 at 09:30:55AM +0900, Namhyung Kim escreveu:
> This makes thread mg sharing test failed due to not decrement
> thread->refcnt on thread__put().

I fixed this one already:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=8b00f46951bed1edd9c5cb9d9adb62d28bbe7623

No?

- Arnaldo

 
> Not-signed-off-by: Namhyung Kim 
> ---
>  tools/perf/util/thread.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 702f12dc5a90..dc5ec9a5cca1 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -163,7 +163,7 @@ struct thread *thread__get(struct thread *thread)
>  
>  void thread__put(struct thread *thread)
>  {
> - if (thread && atomic_dec_and_test(&thread->refcnt)) {
> + if (thread && atomic_dec_and_test(&thread->refcnt) && 0) {
>   if (!RB_EMPTY_NODE(&thread->rb_node)) {
>   struct machine *machine = thread->mg->machine;
>  
> -- 
> 2.4.0

Re: [PATCH] perf tools: Fix dwarf-aux.c compilation on i386

2015-05-17 Thread Arnaldo Carvalho de Melo

Em Sat, May 16, 2015 at 08:21:49AM +0200, Jiri Olsa escreveu:
> On Fri, May 15, 2015 at 04:59:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Fri, May 15, 2015 at 06:23:11PM +0200, Jiri Olsa escreveu:
> > > Replacing %lu format strings for Dwarf_Addr type with PRIu64 as it
> > > fits for Dwarf_Addr (defined as uint64_t) type and works also on
> > > both 32/64 bits.
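
For readers unfamiliar with the format macro mentioned in the quoted patch
description, here is a minimal sketch of the idea. The names addr_t and
format_addr are made up for this illustration; addr_t stands in for
Dwarf_Addr, which the description says is defined as uint64_t.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for Dwarf_Addr (described above as uint64_t); hypothetical name. */
typedef uint64_t addr_t;

/* Format an address portably; returns the number of characters written. */
static int format_addr(char *buf, size_t len, addr_t addr)
{
	/* "%lu" would be wrong on i386, where unsigned long is only 32 bits. */
	return snprintf(buf, len, "addr=%" PRIu64, addr);
}
```

PRIu64 expands to the right length modifier for uint64_t on each target, so
the same format string works on both 32-bit and 64-bit builds.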

> > Please try to always indicate against which branch your patches
> > should be applied, be it as [PATCH perf/core], [PATCH perf/urgent] for
> > isolated patches, or on the cover letter [GIT PULL 0/N perf/core], or as
> > I do with Ingo, "[GIT PULL] perf/core fixes".

> > I thought, for this one, hey, it's a fix, should go soon to Ingo!
> > Tried to apply to perf/urgent, and it didn't apply :-\

> > Applying to perf/core instead now.
 
> hum, I've never rebased against perf/urgent, I'll try
> to squeeze it in next time ;-)

In general, if it is a fix, try to check if it should be applied to
perf/urgent first, if that is not the case, then it is fixing a bug
introduced in perf/core, where it should go.

If you don't make it clear where to apply, then the burden is on me to
check, if I have to do it for everybody... it doesn't scale :-)

- Arnaldo

Re: [PATCH 0/3] add new strscpy() API for string copy

2015-05-17 Thread Michael Ellerman

On Fri, 2015-05-15 at 11:15 -0400, Chris Metcalf wrote:
> On 05/14/2015 07:10 PM, Michael Ellerman wrote:
> > On Thu, 2015-04-30 at 12:01 -0400, Chris Metcalf wrote:
> >>
> >> I tested the implementation with a simple user-space harness, so I
> >> believe it is correct for the corner cases I could think of.  In
> >> particular I pairwise-tested all the unaligned values of source and
> >> dest, and tested the restriction on src page-crossing at all
> >> unaligned offsets approaching the page boundary.
> > Can you please put that in tools/testing/selftests and merge it as part of
> > the series? That way I can run the tests and be confident it works on
> > powerpc.
> 
> Unfortunately, the strscpy patch series only changes the one previous
> user of the API, which is a tile-architecture-only driver piece, not
> particularly useful for anyone else for testing.
> 
> The testing I did pulled strscpy() and word-at-a-time out into a
> separate, standalone userspace implementation, and tested it there,
> rather than doing tests through the syscall API like 
> tools/testing/selftests.

Not everything in selftests has to or does go through the syscall API.

We (powerpc) have tests of our memcpy/memcmp/load_unaligned_zeropad that are
built as standalone test programs.

Doing that for stuff in lib/string.c does look a bit complicated, because you'd
need to pull in a bunch of kernel headers.

Do you mind posting your test code somewhere so I can run it? Maybe I can
work out how to fold it into a selftest.
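
To make the discussion concrete, a standalone harness of the kind described
above could look roughly like the sketch below. This is not Chris's test
code. sketch_strscpy is a hypothetical byte-at-a-time stand-in, not the
word-at-a-time implementation from the series, and the return convention
(bytes copied, or -E2BIG on truncation) is an assumption of this sketch. The
page-crossing cases mentioned in the quoted text are not covered here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-at-a-time stand-in for the function under test. */
static long sketch_strscpy(char *dest, const char *src, size_t count)
{
	size_t i;

	if (count == 0)
		return -E2BIG;
	for (i = 0; i < count; i++) {
		dest[i] = src[i];
		if (src[i] == '\0')
			return (long)i;	/* bytes copied, excluding the NUL */
	}
	dest[count - 1] = '\0';		/* truncated: still NUL-terminate */
	return -E2BIG;
}

/* Try every src/dest misalignment within one word; return the failure count. */
static int check_all_offsets(void)
{
	static const char msg[] = "unaligned copy";
	char srcbuf[64], dstbuf[64];
	int failures = 0;

	for (size_t s = 0; s < sizeof(long); s++) {
		for (size_t d = 0; d < sizeof(long); d++) {
			memset(srcbuf, 0, sizeof(srcbuf));
			memset(dstbuf, 0x55, sizeof(dstbuf));
			memcpy(srcbuf + s, msg, sizeof(msg));

			/* A full copy must report the string length. */
			if (sketch_strscpy(dstbuf + d, srcbuf + s, 32) != (long)strlen(msg) ||
			    strcmp(dstbuf + d, msg) != 0)
				failures++;

			/* A truncated copy must fail but leave a terminated prefix. */
			if (sketch_strscpy(dstbuf + d, srcbuf + s, 5) != -E2BIG ||
			    strcmp(dstbuf + d, "unal") != 0)
				failures++;
		}
	}
	return failures;
}
```

Swapping the stand-in for the real implementation would still require
pulling in the kernel headers it depends on, which is the complication
noted above.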

cheers



[PATCH 04/40] perf tools: Create separate mmap for dummy tracking event

2015-05-17 Thread Namhyung Kim

When indexed data file support is enabled, a dummy tracking event will
be used to track metadata (like task, comm and mmap events) for a
session and actual samples will be recorded in separate (intermediate)
files and then merged (with index table).

Provide separate mmap to the dummy tracking event.  The size is fixed
to 128KiB (+ 1 page) as the event rate will be lower than samples.  I
originally wanted to use a single mmap for this but cross-cpu sharing
is prohibited so it's per-cpu (or per-task) like normal mmaps.
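
As a rough illustration of how one index can address either the normal
mmaps or the tracking mmaps: the diff below maps non-negative indexes to
evlist->mmap[] and negative ones to evlist->track_mmap[] through
track_mmap_idx(). The macro body is not part of this excerpt, so the
encoding used here is only an assumption, and the struct names are made up
for the sketch.

```c
/* Assumed encoding (0 <-> -1, 1 <-> -2, ...); the real macro is not shown. */
#define track_mmap_idx(idx) (-(idx) - 1)

struct mmap_desc {
	int refcnt;
};

/* Hypothetical, cut-down stand-in for struct perf_evlist. */
struct evlist_sketch {
	struct mmap_desc mmap[4];	/* sample mmaps, index >= 0 */
	struct mmap_desc track_mmap[4];	/* tracking mmaps, index < 0 */
};

/* Pick the descriptor that an index refers to. */
static struct mmap_desc *mmap_desc(struct evlist_sketch *el, int idx)
{
	if (idx >= 0)
		return &el->mmap[idx];
	return &el->track_mmap[track_mmap_idx(idx)];
}
```

With this encoding the macro is its own inverse, so the same helper
converts in both directions, which matches how the diff uses it.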

Cc: Adrian Hunter 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-record.c |  10 +++-
 tools/perf/util/evlist.c| 131 +++-
 tools/perf/util/evlist.h|  11 +++-
 3 files changed, 123 insertions(+), 29 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 5dfe91395617..7d7ef5a1b0a6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -71,7 +71,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 
 static int record__mmap_read(struct record *rec, int idx)
 {
-   struct perf_mmap *md = &rec->evlist->mmap[idx];
+   struct perf_mmap *md = perf_evlist__mmap_desc(rec->evlist, idx);
u64 head = perf_mmap__read_head(md);
u64 old = md->prev;
unsigned char *data = md->base + page_size;
@@ -107,6 +107,7 @@ static int record__mmap_read(struct record *rec, int idx)
}
 
md->prev = old;
+
perf_evlist__mmap_consume(rec->evlist, idx);
 out:
return rc;
@@ -414,6 +415,13 @@ static int record__mmap_read_all(struct record *rec)
}
}
 
+   if (rec->evlist->track_mmap[i].base) {
+   if (record__mmap_read(rec, track_mmap_idx(i)) != 0) {
+   rc = -1;
+   goto out;
+   }
+   }
+
if (mm->base && !rec->opts.auxtrace_snapshot_mode &&
record__auxtrace_mmap_read(rec, mm) != 0) {
rc = -1;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5bbd0ea82fc4..5db249eb4dcd 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -28,6 +28,7 @@
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx);
 static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx);
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx);
 
 #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
 #define SID(e, x, y) xyarray__entry(e->sample_id, x, y)
@@ -728,22 +729,39 @@ static bool perf_mmap__empty(struct perf_mmap *md)
return perf_mmap__read_head(md) == md->prev && !md->auxtrace_mmap.base;
 }
 
+struct perf_mmap *perf_evlist__mmap_desc(struct perf_evlist *evlist, int idx)
+{
+   if (idx >= 0)
+   return &evlist->mmap[idx];
+   else
+   return &evlist->track_mmap[track_mmap_idx(idx)];
+}
+
 static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
 {
-   ++evlist->mmap[idx].refcnt;
+   struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
+
+   ++md->refcnt;
 }
 
 static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
 {
-   BUG_ON(evlist->mmap[idx].refcnt == 0);
+   struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
-   if (--evlist->mmap[idx].refcnt == 0)
+   BUG_ON(md->refcnt == 0);
+
+   if (--md->refcnt != 0)
+   return;
+
+   if (idx >= 0)
__perf_evlist__munmap(evlist, idx);
+   else
+   __perf_evlist__munmap_track(evlist, track_mmap_idx(idx));
 }
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-   struct perf_mmap *md = &evlist->mmap[idx];
+   struct perf_mmap *md = perf_evlist__mmap_desc(evlist, idx);
 
if (!evlist->overwrite) {
u64 old = md->prev;
@@ -793,6 +811,15 @@ static void __perf_evlist__munmap(struct perf_evlist 
*evlist, int idx)
auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
 }
 
+static void __perf_evlist__munmap_track(struct perf_evlist *evlist, int idx)
+{
+   if (evlist->track_mmap[idx].base != NULL) {
+   munmap(evlist->track_mmap[idx].base, TRACK_MMAP_SIZE);
+   evlist->track_mmap[idx].base = NULL;
+   evlist->track_mmap[idx].refcnt = 0;
+   }
+}
+
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
int i;
@@ -804,24 +831,44 @@ void perf_evlist__munmap(struct perf_evlist *evlist)
__perf_evlist__munmap(evlist, i);
 
zfree(&evlist->mmap);
+
+   if (evlist->track_mmap == NULL)
+   return;
+
+   for (i = 0; i < evlist->nr_mmaps; i++)
+   __perf_evlist__munmap_track(evlist, i);
+
+   zfree(&evlist->track_mmap);
 }
 
-static int perf_evlist__alloc_mmap(struct perf_evlist *evlist)
+static int perf_evlist__alloc_mmap(struct perf_evlist *evlist,

[PATCH 05/40] perf tools: Introduce perf_evlist__mmap_track()

2015-05-17 Thread Namhyung Kim

The perf_evlist__mmap_track function creates data mmaps and optionally
tracking mmaps for events.  It'll be used for perf record to save events
in separate files and build an index table.  Checking dummy tracking
event in perf_evlist__mmap() alone is not enough as users can specify a
dummy event (like in keep tracking testcase) without the index option.

Cc: Adrian Hunter 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-record.c |  2 +-
 tools/perf/util/evlist.c| 13 -
 tools/perf/util/evlist.h|  2 +-
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7d7ef5a1b0a6..21f7edb23370 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -304,7 +304,7 @@ static int record__open(struct record *rec)
 
if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
 opts->auxtrace_mmap_pages,
-opts->auxtrace_snapshot_mode) < 0) {
+opts->auxtrace_snapshot_mode, false) < 0) {
if (errno == EPERM) {
pr_err("Permission error mapping pages.\n"
   "Consider increasing "
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 5db249eb4dcd..303249467672 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -863,6 +863,7 @@ static int perf_evlist__alloc_mmap(struct perf_evlist 
*evlist, bool track_mmap)
 
 struct mmap_params {
int prot;
+   bool track;
size_t  len;
struct auxtrace_mmap_params auxtrace_mp;
 };
@@ -913,7 +914,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist 
*evlist, int idx,
 
fd = FD(evsel, cpu, thread);
 
-   if (perf_evsel__is_dummy_tracking(evsel)) {
+   if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
struct mmap_params track_mp = {
.prot   = mp->prot,
.len= TRACK_MMAP_SIZE,
@@ -971,7 +972,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist 
*evlist, int idx,
 thread);
}
 
-   if (perf_evsel__is_dummy_tracking(evsel)) {
+   if (mp->track && perf_evsel__is_dummy_tracking(evsel)) {
/* restore idx as normal idx (positive) */
idx = track_mmap_idx(idx);
}
@@ -1138,6 +1139,7 @@ int perf_evlist__parse_mmap_pages(const struct option 
*opt, const char *str,
  * @overwrite: overwrite older events?
  * @auxtrace_pages - auxtrace map length in pages
  * @auxtrace_overwrite - overwrite older auxtrace data?
+ * @use_track_mmap: use another mmaps to track meta events
  *
  * If @overwrite is %false the user needs to signal event consumption using
  * perf_mmap__write_tail().  Using perf_evlist__mmap_read() does this
@@ -1150,16 +1152,17 @@ int perf_evlist__parse_mmap_pages(const struct option 
*opt, const char *str,
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 bool overwrite, unsigned int auxtrace_pages,
-bool auxtrace_overwrite)
+bool auxtrace_overwrite, bool use_track_mmap)
 {
struct perf_evsel *evsel;
const struct cpu_map *cpus = evlist->cpus;
const struct thread_map *threads = evlist->threads;
struct mmap_params mp = {
.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
+   .track = use_track_mmap,
};
 
-   if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, true) < 0)
+   if (evlist->mmap == NULL && perf_evlist__alloc_mmap(evlist, mp.track) < 
0)
return -ENOMEM;
 
if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) 
< 0)
@@ -1189,7 +1192,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, 
unsigned int pages,
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
  bool overwrite)
 {
-   return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+   return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 335988577c18..27453338a8f5 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -135,7 +135,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 bool overwrite, unsigned int auxtrace_pages,
-bool auxtrace_overwrite);
+bool auxtrace_overwrite, bool use_track_mmap);
 int perf_evlist__mmap(struct perf_evlist *evlist,

[PATCH 10/40] perf tools: Introduce thread__comm(_str)_by_time() helpers

2015-05-17 Thread Namhyung Kim

When data file indexing is enabled, it processes all task, comm and mmap
events first and then goes to the sample events.  So all it sees is the
last comm of a thread although it has information at the time of sample.

Sort thread's comm by time so that it can find appropriate comm at the
sample time.  The thread__comm_by_time() will mostly work even if
PERF_SAMPLE_TIME bit is off since in that case, sample->time will be
-1 so it'll take the last comm anyway.
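
The lookup described above can be sketched outside of perf as follows.
This uses a plain array instead of the kernel-style list, and the names
comm_sketch and comm_by_time are made up for the illustration. The
newest-first ordering mirrors what the insertion loop in the diff below
appears to maintain.

```c
#include <stddef.h>
#include <stdint.h>

struct comm_sketch {
	const char *name;
	uint64_t start;		/* time at which this comm took effect */
};

/* comms[] must be sorted newest first. */
static const struct comm_sketch *
comm_by_time(const struct comm_sketch *comms, size_t n, uint64_t timestamp)
{
	for (size_t i = 0; i < n; i++) {
		if (timestamp >= comms[i].start)
			return &comms[i];
	}
	/* The sample predates every comm: fall back to the oldest entry. */
	return n ? &comms[n - 1] : NULL;
}
```

A timestamp of -1 converted to u64 is the largest possible value, so the
first (newest) entry always matches, which is the "take the last comm
anyway" behaviour mentioned above.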

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/thread.c | 33 -
 tools/perf/util/thread.h |  2 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 16c28a37a9e4..962558024415 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -123,6 +123,21 @@ struct comm *thread__exec_comm(const struct thread *thread)
return last;
 }
 
+struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp)
+{
+   struct comm *comm;
+
+   list_for_each_entry(comm, &thread->comm_list, list) {
+   if (timestamp >= comm->start)
+   return comm;
+   }
+
+   if (list_empty(&thread->comm_list))
+   return NULL;
+
+   return list_last_entry(&thread->comm_list, struct comm, list);
+}
+
 int __thread__set_comm(struct thread *thread, const char *str, u64 timestamp,
   bool exec)
 {
@@ -138,7 +153,13 @@ int __thread__set_comm(struct thread *thread, const char 
*str, u64 timestamp,
new = comm__new(str, timestamp, exec);
if (!new)
return -ENOMEM;
-   list_add(&new->list, &thread->comm_list);
+
+   /* sort by time */
+   list_for_each_entry(curr, &thread->comm_list, list) {
+   if (timestamp >= curr->start)
+   break;
+   }
+   list_add_tail(&new->list, &curr->list);
 
if (exec)
unwind__flush_access(thread);
@@ -159,6 +180,16 @@ const char *thread__comm_str(const struct thread *thread)
return comm__str(comm);
 }
 
+const char *thread__comm_str_by_time(const struct thread *thread, u64 
timestamp)
+{
+   const struct comm *comm = thread__comm_by_time(thread, timestamp);
+
+   if (!comm)
+   return NULL;
+
+   return comm__str(comm);
+}
+
 /* CHECKME: it should probably better return the max comm len from its comm 
list */
 int thread__comm_len(struct thread *thread)
 {
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index f33c48cfdaa0..903cfaf2628d 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -68,7 +68,9 @@ static inline int thread__set_comm(struct thread *thread, 
const char *comm,
 int thread__comm_len(struct thread *thread);
 struct comm *thread__comm(const struct thread *thread);
 struct comm *thread__exec_comm(const struct thread *thread);
+struct comm *thread__comm_by_time(const struct thread *thread, u64 timestamp);
 const char *thread__comm_str(const struct thread *thread);
+const char *thread__comm_str_by_time(const struct thread *thread, u64 
timestamp);
 void thread__insert_map(struct thread *thread, struct map *map);
 int thread__fork(struct thread *thread, struct thread *parent, u64 timestamp);
 size_t thread__fprintf(struct thread *thread, FILE *fp);
-- 
2.4.0


[PATCH 09/40] perf report: Skip dummy tracking event

2015-05-17 Thread Namhyung Kim

The dummy tracking event is only for tracking task/comm/mmap events
and has no sample data for itself.  So no need to report, just skip it.

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c|  3 +++
 tools/perf/ui/browsers/hists.c | 30 --
 tools/perf/ui/gtk/hists.c  |  3 +++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 92fca2157e5e..fee770935eab 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -323,6 +323,9 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist 
*evlist,
struct hists *hists = evsel__hists(pos);
const char *evname = perf_evsel__name(pos);
 
+   if (perf_evsel__is_dummy_tracking(pos))
+   continue;
+
if (symbol_conf.event_group &&
!perf_evsel__is_group_leader(pos))
continue;
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index f981cb8f0158..2cc18b693950 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2128,14 +2128,17 @@ static int perf_evsel_menu__run(struct perf_evsel_menu 
*menu,
return key;
 }
 
-static bool filter_group_entries(struct ui_browser *browser __maybe_unused,
-void *entry)
+static bool filter_entries(struct ui_browser *browser __maybe_unused,
+  void *entry)
 {
struct perf_evsel *evsel = list_entry(entry, struct perf_evsel, node);
 
if (symbol_conf.event_group && !perf_evsel__is_group_leader(evsel))
return true;
 
+   if (perf_evsel__is_dummy_tracking(evsel))
+   return true;
+
return false;
 }
 
@@ -2152,7 +2155,7 @@ static int __perf_evlist__tui_browse_hists(struct 
perf_evlist *evlist,
.refresh= ui_browser__list_head_refresh,
.seek   = ui_browser__list_head_seek,
.write  = perf_evsel_menu__write,
-   .filter = filter_group_entries,
+   .filter = filter_entries,
.nr_entries = nr_entries,
.priv   = evlist,
},
@@ -2179,21 +2182,22 @@ int perf_evlist__tui_browse_hists(struct perf_evlist 
*evlist, const char *help,
  struct perf_session_env *env)
 {
int nr_entries = evlist->nr_entries;
+   struct perf_evsel *first = perf_evlist__first(evlist);
+   struct perf_evsel *pos;
 
 single_entry:
if (nr_entries == 1) {
-   struct perf_evsel *first = perf_evlist__first(evlist);
-
return perf_evsel__hists_browse(first, nr_entries, help,
false, hbt, min_pcnt,
env);
}
 
if (symbol_conf.event_group) {
-   struct perf_evsel *pos;
 
nr_entries = 0;
evlist__for_each(evlist, pos) {
+   if (perf_evsel__is_dummy_tracking(pos))
+   continue;
if (perf_evsel__is_group_leader(pos))
nr_entries++;
}
@@ -2202,6 +2206,20 @@ int perf_evlist__tui_browse_hists(struct perf_evlist 
*evlist, const char *help,
goto single_entry;
}
 
+   evlist__for_each(evlist, pos) {
+   if (perf_evsel__is_dummy_tracking(pos))
+   nr_entries--;
+   }
+
+   if (nr_entries == 1) {
+   evlist__for_each(evlist, pos) {
+   if (!perf_evsel__is_dummy_tracking(pos)) {
+   first = pos;
+   goto single_entry;
+   }
+   }
+   }
+
return __perf_evlist__tui_browse_hists(evlist, nr_entries, help,
   hbt, min_pcnt, env);
 }
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..83a7ecd5cda8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -317,6 +317,9 @@ int perf_evlist__gtk_browse_hists(struct perf_evlist 
*evlist,
char buf[512];
size_t size = sizeof(buf);
 
+   if (perf_evsel__is_dummy_tracking(pos))
+   continue;
+
if (symbol_conf.event_group) {
if (!perf_evsel__is_group_leader(pos))
continue;
-- 
2.4.0


[PATCH 13/40] perf tools: Convert dead thread list into rbtree

2015-05-17 Thread Namhyung Kim

Currently perf maintains dead threads in a linked list but this can be
a problem if someone needs to search from it especially in a large
session which might have many dead threads.  Convert it to a rbtree
like normal threads and it'll be used later with multi-file changes.

The list node is now used for chaining dead threads of same tid since
it's easier to handle such threads in time order.
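
A simplified sketch of that layout, assuming nothing beyond what is
described above: a search tree keyed by tid, where each tree node carries
a chain of further dead threads with the same tid. A plain unbalanced
binary tree stands in for the rbtree here, and all names are made up for
the illustration.

```c
#include <stddef.h>

struct dead_thread {
	int tid;
	struct dead_thread *left, *right;	/* tree links, keyed by tid */
	struct dead_thread *next_same_tid;	/* later threads that reused the tid */
};

/* Insert th; a thread whose tid is already present joins that node's chain. */
static void dead_insert(struct dead_thread **root, struct dead_thread *th)
{
	struct dead_thread **p = root;

	th->left = th->right = th->next_same_tid = NULL;
	while (*p != NULL) {
		struct dead_thread *pos = *p;

		if (pos->tid == th->tid) {
			while (pos->next_same_tid)
				pos = pos->next_same_tid;
			pos->next_same_tid = th;	/* keeps insertion order */
			return;
		}
		p = th->tid < pos->tid ? &pos->left : &pos->right;
	}
	*p = th;
}

/* Find the tree node for tid, or NULL; its chain holds the other threads. */
static struct dead_thread *dead_find(struct dead_thread *root, int tid)
{
	while (root && root->tid != tid)
		root = tid < root->tid ? root->left : root->right;
	return root;
}
```

The search cost then depends on the number of distinct tids rather than on
the total number of dead threads, which is the point of moving away from a
single linked list.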

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/machine.c | 80 ++-
 tools/perf/util/machine.h |  2 +-
 tools/perf/util/thread.c  | 21 +++--
 tools/perf/util/thread.h  | 11 +++
 4 files changed, 96 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 34bf89f7f4f3..ae07b84a40f5 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -30,8 +30,8 @@ int machine__init(struct machine *machine, const char 
*root_dir, pid_t pid)
dsos__init(&machine->kernel_dsos);
 
machine->threads = RB_ROOT;
+   machine->dead_threads = RB_ROOT;
pthread_rwlock_init(&machine->threads_lock, NULL);
-   INIT_LIST_HEAD(&machine->dead_threads);
machine->last_match = NULL;
 
machine->vdso_info = NULL;
@@ -93,6 +93,29 @@ static void dsos__delete(struct dsos *dsos)
}
 }
 
+static void machine__delete_dead_threads(struct machine *machine)
+{
+   struct rb_node *nd = rb_first(&machine->dead_threads);
+
+   while (nd) {
+   struct thread *t = rb_entry(nd, struct thread, rb_node);
+   struct thread *pos;
+
+   nd = rb_next(nd);
+   rb_erase(&t->rb_node, &machine->dead_threads);
+   RB_CLEAR_NODE(&t->rb_node);
+
+   while (!list_empty(&t->tid_node)) {
+   pos = list_first_entry(&t->tid_node,
+  struct thread, tid_node);
+   list_del_init(&pos->tid_node);
+   thread__delete(pos);
+   }
+
+   thread__delete(t);
+   }
+}
+
 void machine__delete_threads(struct machine *machine)
 {
struct rb_node *nd;
@@ -106,6 +129,8 @@ void machine__delete_threads(struct machine *machine)
__machine__remove_thread(machine, t, false);
}
pthread_rwlock_unlock(&machine->threads_lock);
+
+   machine__delete_dead_threads(machine);
 }
 
 void machine__exit(struct machine *machine)
@@ -1308,6 +1333,10 @@ int machine__process_mmap_event(struct machine *machine, 
union perf_event *event
 
 static void __machine__remove_thread(struct machine *machine, struct thread 
*th, bool lock)
 {
+   struct rb_node **p = &machine->dead_threads.rb_node;
+   struct rb_node *parent = NULL;
+   struct thread *pos;
+
if (machine->last_match == th)
machine->last_match = NULL;
 
@@ -1316,15 +1345,43 @@ static void __machine__remove_thread(struct machine 
*machine, struct thread *th,
pthread_rwlock_wrlock(&machine->threads_lock);
rb_erase(&th->rb_node, &machine->threads);
RB_CLEAR_NODE(&th->rb_node);
+
+   th->dead = true;
+
+   /*
+* No need to have an additional reference for non-index file.
+*/
+   if (!perf_has_index) {
+   thread__put(th);
+   goto out;
+   }
+
/*
-* Move it first to the dead_threads list, then drop the reference,
-* if this is the last reference, then the thread__delete destructor
-* will be called and we will remove it from the dead_threads list.
+* For indexed file, We may have references to this (dead)
+* thread, as samples are processed after fork/exit events.
+* Just move them to a separate rbtree.
 */
-   list_add_tail(&th->node, &machine->dead_threads);
+   while (*p != NULL) {
+   parent = *p;
+   pos = rb_entry(parent, struct thread, rb_node);
+
+   if (pos->tid == th->tid) {
+   list_add_tail(&th->tid_node, &pos->tid_node);
+   goto out;
+   }
+
+   if (th->tid < pos->tid)
+   p = &(*p)->rb_left;
+   else
+   p = &(*p)->rb_right;
+   }
+
+   rb_link_node(&th->rb_node, parent, p);
+   rb_insert_color(&th->rb_node, &machine->dead_threads);
+
+out:
if (lock)
pthread_rwlock_unlock(&machine->threads_lock);
-   thread__put(th);
 }
 
 void machine__remove_thread(struct machine *machine, struct thread *th)
@@ -1826,7 +1883,7 @@ int machine__for_each_thread(struct machine *machine,
 void *priv)
 {
struct rb_node *nd;
-   struct thread *thread;
+   struct thread *thread, *pos;
int rc = 0;
 
for (nd = rb_first(&machine->threads); nd; nd = rb_next(nd)) {
@@ -1836,10 +1893,17 @@ int machine__for_each_thread(struct machine *machine,
return rc;
}
 
-   list_for_each_entry(thread, &machine->dead_threads, node) {
+   for

[PATCH 14/40] perf tools: Introduce machine__find*_thread_by_time()

2015-05-17 Thread Namhyung Kim

When data file indexing is enabled, it needs to search thread based on
sample time since sample processing is done after other (task, comm and
mmap) events are processed.  This can be a problem if a session is very
long and pid is recycled - in that case it'll only see the last one.

So keep thread start time in it, and search thread based on the time.
This patch introduces machine__find{,new}_thread_by_time() function
for this.  It'll first search current thread rbtree and then dead
thread tree and list.  If it couldn't find anyone, it'll create a new
thread.

The sample timestamp of 0 means that this is called from synthesized
event so just use current rbtree.  The timestamp will be -1 if sample
didn't record the timestamp so will see current threads automatically.
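
The search order and the two special timestamps described above can be
sketched as follows. This is only an illustration under stated
assumptions: the names are made up, the dead threads for one tid are
passed in as an array sorted newest first, and the "create a new thread"
step is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

struct thread_sketch {
	int tid;
	uint64_t start;		/* time the thread was created */
};

/*
 * live: the current thread for this tid, or NULL.
 * dead[]: earlier threads with the same tid, newest first (assumed order).
 */
static const struct thread_sketch *
find_thread_by_time(const struct thread_sketch *live,
		    const struct thread_sketch *dead, size_t nr_dead,
		    uint64_t timestamp)
{
	/* 0 marks a synthesized event: the current thread is used as-is. */
	if (timestamp == 0)
		return live;

	/* (u64)-1 is >= any start time, so an untimed sample lands here too. */
	if (live && timestamp >= live->start)
		return live;

	for (size_t i = 0; i < nr_dead; i++) {
		if (timestamp >= dead[i].start)
			return &dead[i];
	}
	return NULL;		/* the caller would create a new thread */
}
```

A sample taken before a pid was recycled therefore resolves to the older,
dead thread rather than to the thread that currently owns the pid.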

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/dwarf-unwind.c |   8 +--
 tools/perf/tests/hists_common.c |   3 +-
 tools/perf/tests/hists_link.c   |   2 +-
 tools/perf/util/event.c |   6 +-
 tools/perf/util/machine.c   | 126 +++-
 tools/perf/util/machine.h   |  10 +++-
 tools/perf/util/thread.c|   4 ++
 tools/perf/util/thread.h|   1 +
 8 files changed, 148 insertions(+), 12 deletions(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 9b748e1ad46e..1926799bfcdb 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -16,10 +16,10 @@
 
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
union perf_event *event,
-   struct perf_sample *sample __maybe_unused,
+   struct perf_sample *sample,
struct machine *machine)
 {
-   return machine__process_mmap2_event(machine, event, NULL);
+   return machine__process_mmap2_event(machine, event, sample);
 }
 
 static int init_live_machine(struct machine *machine)
@@ -66,12 +66,10 @@ static int unwind_entry(struct unwind_entry *entry, void 
*arg)
 __attribute__ ((noinline))
 static int unwind_thread(struct thread *thread)
 {
-   struct perf_sample sample;
+   struct perf_sample sample = { .time = -1ULL, };
unsigned long cnt = 0;
int err = -1;
 
-   memset(&sample, 0, sizeof(sample));
-
if (test__arch_unwind_sample(&sample, thread)) {
pr_debug("failed to get unwind sample\n");
goto out;
diff --git a/tools/perf/tests/hists_common.c b/tools/perf/tests/hists_common.c
index 456f884eb27b..2fd9cb71b258 100644
--- a/tools/perf/tests/hists_common.c
+++ b/tools/perf/tests/hists_common.c
@@ -80,6 +80,7 @@ static struct {
 struct machine *setup_fake_machine(struct machines *machines)
 {
struct machine *machine = machines__find(machines, HOST_KERNEL_ID);
+   struct perf_sample sample = { .time = -1ULL, };
size_t i;
 
if (machine == NULL) {
@@ -114,7 +115,7 @@ struct machine *setup_fake_machine(struct machines 
*machines)
strcpy(fake_mmap_event.mmap.filename,
   fake_mmap_info[i].filename);
 
-   machine__process_mmap_event(machine, &fake_mmap_event, NULL);
+   machine__process_mmap_event(machine, &fake_mmap_event, &sample);
}
 
for (i = 0; i < ARRAY_SIZE(fake_symbols); i++) {
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 27bae90c9a95..cacc8617bf02 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
struct perf_evsel *evsel;
struct addr_location al;
struct hist_entry *he;
-   struct perf_sample sample = { .period = 1, };
+   struct perf_sample sample = { .period = 1, .time = -1ULL, };
size_t i = 0, k;
 
/*
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index a513a51f7330..819a2d75411c 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -9,6 +9,7 @@
 #include "strlist.h"
 #include "thread.h"
 #include "thread_map.h"
+#include "session.h"
 #include "symbol/kallsyms.h"
 
 static const char *perf_event__names[] = {
@@ -929,9 +930,10 @@ int perf_event__preprocess_sample(const union perf_event 
*event,
  struct perf_sample *sample)
 {
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
-   struct thread *thread = machine__findnew_thread(machine, sample->pid,
-   sample->tid);
+   struct thread *thread;
 
+   thread = machine__findnew_thread_by_time(machine, sample->pid,
+sample->tid, sample->time);
if (thread == NULL)
return -1;
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ae07b84a40f5..9e2f4e8663d5 100644
--- a/tools/perf/util/machine.c
+++

[PATCH 16/40] perf tools: Reducing arguments of hist_entry_iter__add()

2015-05-17 Thread Namhyung Kim

The evsel and sample arguments are only used to set up the iter for
later use.  Since the function already takes the iter as an argument,
set those fields in the iter before the call and drop the two arguments.
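
The idea can be sketched outside of perf as follows.  This is only a
minimal illustration: the demo_* types and functions are hypothetical
stand-ins, not the real hist_entry_iter API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for perf's evsel and sample types. */
struct demo_evsel  { const char *name; };
struct demo_sample { unsigned long long period; };

/* The iterator carries its own context, so callers fill it in up front. */
struct demo_iter {
	struct demo_evsel  *evsel;
	struct demo_sample *sample;
	unsigned long long  total_period;
};

/* Takes only the iterator; evsel and sample are read from its fields. */
static int demo_iter_add(struct demo_iter *iter, int max_stack)
{
	assert(iter != NULL);
	if (iter->evsel == NULL || iter->sample == NULL || max_stack <= 0)
		return -1;
	iter->total_period += iter->sample->period;
	return 0;
}

/* Example caller: set the context once, then call with fewer arguments. */
static unsigned long long demo_run(void)
{
	struct demo_evsel evsel = { .name = "cycles" };
	struct demo_sample sample = { .period = 3 };
	struct demo_iter iter = {
		.evsel  = &evsel,
		.sample = &sample,
	};

	if (demo_iter_add(&iter, 127) < 0)
		return 0;
	return iter.total_period;
}
```

Keeping the context in the iterator means a later callback that only
receives the iter can still reach the evsel and the sample.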

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c   | 9 +
 tools/perf/builtin-top.c  | 7 ---
 tools/perf/tests/hists_cumulate.c | 6 --
 tools/perf/tests/hists_filter.c   | 4 +++-
 tools/perf/tests/hists_output.c   | 6 --
 tools/perf/util/hist.c| 8 ++--
 tools/perf/util/hist.h| 1 -
 7 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fee770935eab..decd9e8584b5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -139,8 +139,10 @@ static int process_sample_event(struct perf_tool *tool,
struct report *rep = container_of(tool, struct report, tool);
struct addr_location al;
struct hist_entry_iter iter = {
-   .hide_unresolved = rep->hide_unresolved,
-   .add_entry_cb = hist_iter__report_callback,
+   .evsel  = evsel,
+   .sample = sample,
+   .hide_unresolved= rep->hide_unresolved,
+   .add_entry_cb   = hist_iter__report_callback,
};
int ret = 0;
 
@@ -168,8 +170,7 @@ static int process_sample_event(struct perf_tool *tool,
if (al.map != NULL)
al.map->dso->hit = 1;
 
-   ret = hist_entry_iter__add(, , evsel, sample, rep->max_stack,
-  rep);
+   ret = hist_entry_iter__add(, , rep->max_stack, rep);
if (ret < 0)
pr_debug("problem adding hist entry, skipping event\n");
 out_put:
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index a19351728f0f..6b987424d015 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -775,7 +775,9 @@ static void perf_event__process_sample(struct perf_tool 
*tool,
if (al.sym == NULL || !al.sym->ignore) {
struct hists *hists = evsel__hists(evsel);
struct hist_entry_iter iter = {
-   .add_entry_cb = hist_iter__top_callback,
+   .evsel  = evsel,
+   .sample = sample,
+   .add_entry_cb   = hist_iter__top_callback,
};
 
if (symbol_conf.cumulate_callchain)
@@ -785,8 +787,7 @@ static void perf_event__process_sample(struct perf_tool 
*tool,
 
pthread_mutex_lock(>lock);
 
-   err = hist_entry_iter__add(, , evsel, sample,
-  top->max_stack, top);
+   err = hist_entry_iter__add(, , top->max_stack, top);
if (err < 0)
pr_err("Problem incrementing symbol period, skipping 
event\n");
 
diff --git a/tools/perf/tests/hists_cumulate.c 
b/tools/perf/tests/hists_cumulate.c
index 620f626e5b35..7d82c8be5e36 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -87,6 +87,8 @@ static int add_hist_entries(struct hists *hists, struct 
machine *machine)
},
};
struct hist_entry_iter iter = {
+   .evsel = evsel,
+   .sample = ,
.hide_unresolved = false,
};
 
@@ -104,8 +106,8 @@ static int add_hist_entries(struct hists *hists, struct 
machine *machine)
  ) < 0)
goto out;
 
-   if (hist_entry_iter__add(, , evsel, ,
-PERF_MAX_STACK_DEPTH, NULL) < 0) {
+   if (hist_entry_iter__add(, , PERF_MAX_STACK_DEPTH,
+NULL) < 0) {
addr_location__put();
goto out;
}
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 82e1ee52e024..ce48775e6ada 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -63,6 +63,8 @@ static int add_hist_entries(struct perf_evlist *evlist,
},
};
struct hist_entry_iter iter = {
+   .evsel = evsel,
+   .sample = ,
.ops = _iter_normal,
.hide_unresolved = false,
};
@@ -81,7 +83,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
  ) < 0)
goto out;
 
-   if (hist_entry_iter__add(, , evsel, ,
+   if (hist_entry_iter__add(, ,
 PERF_MAX_STACK_DEPTH, NULL) < 
0) {

[PATCH 17/40] perf tools: Maintain map groups list in a leader thread

2015-05-17 Thread Namhyung Kim

To support multi-threaded perf report, we need to maintain time-sorted
map groups.  Add ->mg_list member to struct thread and sort the list
by time.  Now leader threads have one more refcnt for map groups in
the list so also update the thread-mg-share test case.

Currently a new map groups is added only when an exec (comm) event is
received.
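
The refcnt change in the test case can be illustrated with a small
sketch.  It assumes the extra reference is the one held by the leader's
list entry; the demo_* names are hypothetical and much simpler than
struct map_groups.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified map-groups object with a reference count. */
struct demo_mg {
	int refcnt;
	unsigned long long timestamp;
};

static struct demo_mg *demo_mg_new(unsigned long long timestamp)
{
	struct demo_mg *mg = calloc(1, sizeof(*mg));

	if (mg != NULL) {
		mg->refcnt = 1;		/* reference owned by the creator */
		mg->timestamp = timestamp;
	}
	return mg;
}

static struct demo_mg *demo_mg_get(struct demo_mg *mg)
{
	assert(mg != NULL);
	mg->refcnt++;
	return mg;
}

/* Drops one reference; returns 1 if the object was freed. */
static int demo_mg_put(struct demo_mg *mg)
{
	assert(mg != NULL && mg->refcnt > 0);
	if (--mg->refcnt == 0) {
		free(mg);
		return 1;
	}
	return 0;
}

/*
 * Counts references for a group of nr_threads threads (leader included)
 * when the leader additionally keeps the object on its time-sorted list.
 */
static int demo_refcnt_for_group(int nr_threads)
{
	struct demo_mg *mg = demo_mg_new(0);	/* leader->mg */
	int i, refcnt;

	if (mg == NULL)
		return -1;
	demo_mg_get(mg);			/* leader's list entry */
	for (i = 1; i < nr_threads; i++)
		demo_mg_get(mg);		/* each other thread's ->mg */

	refcnt = mg->refcnt;
	while (!demo_mg_put(mg))		/* drop every reference again */
		;
	return refcnt;
}
```

With four threads in a group this gives 5 instead of 4, and with two
threads 3 instead of 2, which matches the updated expectations in
thread-mg-share.c.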

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/thread-mg-share.c |   7 ++-
 tools/perf/util/event.c|   2 +
 tools/perf/util/machine.c  |   4 +-
 tools/perf/util/map.c  |   3 ++
 tools/perf/util/map.h  |   2 +
 tools/perf/util/thread.c   | 108 -
 tools/perf/util/thread.h   |   3 ++
 7 files changed, 124 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/thread-mg-share.c 
b/tools/perf/tests/thread-mg-share.c
index c0ed56f7efc6..50a2c68d6379 100644
--- a/tools/perf/tests/thread-mg-share.c
+++ b/tools/perf/tests/thread-mg-share.c
@@ -23,6 +23,9 @@ int test__thread_mg_share(void)
 * with several threads and checks they properly share and
 * maintain map groups info (struct map_groups).
 *
+* Note that a leader thread has one more refcnt for its
+* (current) map groups.
+*
 * thread group (pid: 0, tids: 0, 1, 2, 3)
 * other  group (pid: 4, tids: 4, 5)
*/
@@ -43,7 +46,7 @@ int test__thread_mg_share(void)
leader && t1 && t2 && t3 && other);
 
mg = leader->mg;
-   TEST_ASSERT_EQUAL("wrong refcnt", mg->refcnt, 4);
+   TEST_ASSERT_EQUAL("wrong refcnt", mg->refcnt, 5);
 
/* test the map groups pointer is shared */
TEST_ASSERT_VAL("map groups don't match", mg == t1->mg);
@@ -71,7 +74,7 @@ int test__thread_mg_share(void)
machine__remove_thread(machine, other_leader);
 
other_mg = other->mg;
-   TEST_ASSERT_EQUAL("wrong refcnt", other_mg->refcnt, 2);
+   TEST_ASSERT_EQUAL("wrong refcnt", other_mg->refcnt, 3);
 
TEST_ASSERT_VAL("map groups don't match", other_mg == other_leader->mg);
 
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 819a2d75411c..0ad76b06cd48 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -851,6 +851,8 @@ void thread__find_addr_map(struct thread *thread, u8 
cpumode,
return;
}
 
+   BUG_ON(mg == NULL);
+
if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
al->level = 'k';
mg = >kmaps;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9e2f4e8663d5..99fb14926351 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -341,7 +341,7 @@ static void machine__update_thread_pid(struct machine 
*machine,
goto out_err;
 
if (!leader->mg)
-   leader->mg = map_groups__new(machine);
+   thread__set_map_groups(leader, map_groups__new(machine), 0);
 
if (!leader->mg)
goto out_err;
@@ -358,7 +358,7 @@ static void machine__update_thread_pid(struct machine 
*machine,
if (!map_groups__empty(th->mg))
pr_err("Discarding thread maps for %d:%d\n",
   th->pid_, th->tid);
-   map_groups__delete(th->mg);
+   map_groups__put(th->mg);
}
 
th->mg = map_groups__get(leader->mg);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index cd0e335008b4..b794c3561995 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -427,6 +427,8 @@ void map_groups__init(struct map_groups *mg, struct machine 
*machine)
}
mg->machine = machine;
mg->refcnt = 1;
+   mg->timestamp = 0;
+   INIT_LIST_HEAD(>list);
 }
 
 static void maps__delete(struct rb_root *maps)
@@ -489,6 +491,7 @@ struct map_groups *map_groups__new(struct machine *machine)
 void map_groups__delete(struct map_groups *mg)
 {
map_groups__exit(mg);
+   list_del(>list);
free(mg);
 }
 
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 4e0c729841ab..074453d332dd 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -61,7 +61,9 @@ struct map_groups {
struct rb_root   maps[MAP__NR_TYPES];
struct list_head removed_maps[MAP__NR_TYPES];
struct machine   *machine;
+   u64  timestamp;
int  refcnt;
+   struct list_head list;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index c8c927488ea0..fc4e51afaf18 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -11,13 +11,76 @@
 #include "unwind.h"
 #include "machine.h"
 
+struct map_groups *thread__get_map_groups(struct thread *thread, u64 timestamp)
+{
+   struct map_groups *mg;
+   struct thread *leader = thread;
+
+

[PATCH 20/40] perf tools: Add a test case for timed map groups handling

2015-05-17 Thread Namhyung Kim

Add a test case that verifies thread->mg and ->mg_list handling across
time changes and exercises the new thread__find_addr_map_by_time() and
friends.

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/Build|  1 +
 tools/perf/tests/builtin-test.c   |  4 ++
 tools/perf/tests/tests.h  |  1 +
 tools/perf/tests/thread-mg-time.c | 93 +++
 4 files changed, 99 insertions(+)
 create mode 100644 tools/perf/tests/thread-mg-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 5ad495823b49..cfd59c61bcd2 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -27,6 +27,7 @@ perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += thread-lookup-time.o
+perf-y += thread-mg-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index e83c7ce1b38a..c5dbeb3d75b1 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -179,6 +179,10 @@ static struct test {
.func = test__thread_lookup_time,
},
{
+   .desc = "Test thread map group handling with time",
+   .func = test__thread_mg_time,
+   },
+   {
.func = NULL,
},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index e9aa78c3d8fc..a2e1f729ae23 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -63,6 +63,7 @@ int test__fdarray__add(void);
 int test__kmod_path__parse(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
+int test__thread_mg_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-mg-time.c 
b/tools/perf/tests/thread-mg-time.c
new file mode 100644
index 000000000000..841777125a64
--- /dev/null
+++ b/tools/perf/tests/thread-mg-time.c
@@ -0,0 +1,93 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+#define PERF_MAP_START  0x4
+
+int test__thread_mg_time(void)
+{
+   struct machines machines;
+   struct machine *machine;
+   struct thread *t;
+   struct map_groups *mg;
+   struct map *map, *old_map;
+   struct addr_location al = { .map = NULL, };
+
+   /*
+* This test is to check whether it can retrieve a correct map
+* for a given time.  When multi-file data storage is enabled,
+* those task/comm/mmap events are processed first so the
+* later sample should find a matching comm properly.
+*/
+   machines__init();
+   machine = 
+
+   /* this is needed to add/find map by time */
+   perf_has_index = true;
+
+   t = machine__findnew_thread(machine, 0, 0);
+   mg = t->mg;
+
+   map = dso__new_map("/usr/bin/perf");
+   map->start = PERF_MAP_START;
+   map->end = PERF_MAP_START + 0x1000;
+
+   thread__insert_map(t, map);
+
+   if (verbose > 1)
+   map_groups__fprintf(t->mg, stderr);
+
+   thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+ PERF_MAP_START, );
+
+   TEST_ASSERT_VAL("cannot find mapping for perf", al.map != NULL);
+   TEST_ASSERT_VAL("non matched mapping found", al.map == map);
+   TEST_ASSERT_VAL("incorrect map groups", al.map->groups == mg);
+   TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+   thread__find_addr_map_by_time(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+ PERF_MAP_START, , -1ULL);
+
+   TEST_ASSERT_VAL("cannot find timed mapping for perf", al.map != NULL);
+   TEST_ASSERT_VAL("non matched timed mapping", al.map == map);
+   TEST_ASSERT_VAL("incorrect timed map groups", al.map->groups == mg);
+   TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+
+   pr_debug("simulate EXEC event (generate new mg)\n");
+   __thread__set_comm(t, "perf-test", 1, true);
+
+   old_map = map;
+
+   map = dso__new_map("/usr/bin/perf-test");
+   map->start = PERF_MAP_START;
+   map->end = PERF_MAP_START + 0x2000;
+
+   thread__insert_map(t, map);
+
+   if (verbose > 1)
+   map_groups__fprintf(t->mg, stderr);
+
+   thread__find_addr_map(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+ PERF_MAP_START + 4, );
+
+   TEST_ASSERT_VAL("cannot find mapping for perf-test", al.map != NULL);
+   TEST_ASSERT_VAL("invalid mapping found", al.map == map);
+   TEST_ASSERT_VAL("incorrect map groups", al.map->groups != mg);
+   TEST_ASSERT_VAL("incorrect map groups", al.map->groups == t->mg);
+
+   pr_debug("searching map in the old mag groups\n");
+   thread__find_addr_map_by_time(t, PERF_RECORD_MISC_USER, MAP__FUNCTION,
+

[PATCH 19/40] perf callchain: Use thread__find_addr_location_by_time() and friends

2015-05-17 Thread Namhyung Kim

Resolve callchain entries with thread__find_addr_location_by_time() and
friends so that the thread/map/symbol matches the sample's timestamp.

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/machine.c  | 25 
 .../util/scripting-engines/trace-event-python.c|  4 ++--
 tools/perf/util/session.h  |  1 -
 tools/perf/util/unwind-libdw.c | 12 ++
 tools/perf/util/unwind-libunwind.c | 27 +++---
 5 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 64719692ec77..17e900a6a3c3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1672,15 +1672,17 @@ static int add_callchain_ip(struct thread *thread,
struct symbol **parent,
struct addr_location *root_al,
u8 *cpumode,
-   u64 ip)
+   u64 ip,
+   u64 timestamp)
 {
struct addr_location al;
 
al.filtered = 0;
al.sym = NULL;
if (!cpumode) {
-   thread__find_cpumode_addr_location(thread, MAP__FUNCTION,
-  ip, );
+   thread__find_cpumode_addr_location_by_time(thread,
+  MAP__FUNCTION, ip,
+  , timestamp);
} else {
if (ip >= PERF_CONTEXT_MAX) {
switch (ip) {
@@ -1705,8 +1707,9 @@ static int add_callchain_ip(struct thread *thread,
}
return 0;
}
-   thread__find_addr_location(thread, *cpumode, MAP__FUNCTION,
-  ip, );
+   thread__find_addr_location_by_time(thread, *cpumode,
+  MAP__FUNCTION, ip,
+  , timestamp);
}
 
if (al.sym != NULL) {
@@ -1848,7 +1851,8 @@ static int resolve_lbr_callchain_sample(struct thread 
*thread,
ip = lbr_stack->entries[0].to;
}
 
-   err = add_callchain_ip(thread, parent, root_al, 
, ip);
+   err = add_callchain_ip(thread, parent, root_al, 
, ip,
+  sample->time);
if (err)
return (err < 0) ? err : 0;
}
@@ -1869,6 +1873,7 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
struct ip_callchain *chain = sample->callchain;
int chain_nr = min(max_stack, (int)chain->nr);
u8 cpumode = PERF_RECORD_MISC_USER;
+   u64 timestamp = sample->time;
int i, j, err;
int skip_idx = -1;
int first_call = 0;
@@ -1934,10 +1939,11 @@ static int thread__resolve_callchain_sample(struct 
thread *thread,
 
for (i = 0; i < nr; i++) {
err = add_callchain_ip(thread, parent, root_al,
-  NULL, be[i].to);
+  NULL, be[i].to, timestamp);
if (!err)
err = add_callchain_ip(thread, parent, root_al,
-  NULL, be[i].from);
+  NULL, be[i].from,
+  timestamp);
if (err == -EINVAL)
break;
if (err)
@@ -1966,7 +1972,8 @@ static int thread__resolve_callchain_sample(struct thread 
*thread,
 #endif
ip = chain->ips[j];
 
-   err = add_callchain_ip(thread, parent, root_al, , ip);
+   err = add_callchain_ip(thread, parent, root_al, , ip,
+  timestamp);
 
if (err)
return (err < 0) ? err : 0;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index 5544b8cdd1ee..9f51b0dbb087 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -304,8 +304,8 @@ static PyObject *get_field_numeric_entry(struct 
event_format *event,
 
 
 static PyObject *python_process_callchain(struct perf_sample *sample,
-struct perf_evsel *evsel,
-struct addr_location *al)
+ struct perf_evsel *evsel,
+ struct addr_location *al)
 {
PyObject *pylist;
 
diff --git a/tools/perf/util/session.h

Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-17 Thread Herbert Xu

On Sun, May 17, 2015 at 12:48:11PM +0200, Boris Brezillon wrote:
> 
> Yep, but then they shouldn't be declared with CRYPTO_ALG_ASYNC and as an
> ablkcipher algorithm (*Asynchronous* Block Cipher), right ?

Right.  They can still use ablkcipher but should clear the ASYNC
bit.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 11/40] perf tools: Add a test case for thread comm handling

2015-05-17 Thread Namhyung Kim

The new test case checks various thread comm handling like overriding
and time sorting.
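
As a rough illustration of what "time sorting" means here, the sketch
below keeps comm entries ordered newest first and returns the newest
entry that is not later than the requested time.  It is a simplified
model with hypothetical demo_* names and a fixed-size array, not the
list-based implementation in thread.c.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_COMMS 8

struct demo_comm {
	unsigned long long start;	/* time the name took effect */
	const char *str;
};

/* Entries are kept sorted by start time, newest first. */
struct demo_thread {
	struct demo_comm comms[DEMO_MAX_COMMS];
	int nr;
};

/* Insert while preserving the newest-first order. */
static int demo_set_comm(struct demo_thread *t, const char *str,
			 unsigned long long start)
{
	int pos = 0;

	assert(t != NULL && str != NULL);
	if (t->nr == DEMO_MAX_COMMS)
		return -1;
	while (pos < t->nr && t->comms[pos].start > start)
		pos++;
	/* Shift older entries down by one slot to make room. */
	memmove(&t->comms[pos + 1], &t->comms[pos],
		(size_t)(t->nr - pos) * sizeof(t->comms[0]));
	t->comms[pos].start = start;
	t->comms[pos].str = str;
	t->nr++;
	return 0;
}

/* Return the newest name whose start time is not after 'time'. */
static const char *demo_comm_by_time(const struct demo_thread *t,
				     unsigned long long time)
{
	int i;

	for (i = 0; i < t->nr; i++) {
		if (t->comms[i].start <= time)
			return t->comms[i].str;
	}
	return NULL;	/* no name was set at or before 'time' */
}
```

An entry that arrives out of order, like "perf-test1.5" in the test
case, is placed between its neighbours at insertion time, so lookups
never need to sort.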

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/Build  |  1 +
 tools/perf/tests/builtin-test.c |  4 
 tools/perf/tests/perf-targz-src-pkg | 21 -
 tools/perf/tests/tests.h|  1 +
 tools/perf/tests/thread-comm.c  | 47 +
 5 files changed, 53 insertions(+), 21 deletions(-)
 delete mode 100755 tools/perf/tests/perf-targz-src-pkg
 create mode 100644 tools/perf/tests/thread-comm.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 6a8801b32017..78d29a3a6a97 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -24,6 +24,7 @@ perf-y += bp_signal_overflow.o
 perf-y += task-exit.o
 perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
+perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index f42af98a5c16..372b6395a448 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -171,6 +171,10 @@ static struct test {
.func = test__kmod_path__parse,
},
{
+   .desc = "Test thread comm handling",
+   .func = test__thread_comm,
+   },
+   {
.func = NULL,
},
 };
diff --git a/tools/perf/tests/perf-targz-src-pkg 
b/tools/perf/tests/perf-targz-src-pkg
deleted file mode 100755
index 238aa3927c71..000000000000
--- a/tools/perf/tests/perf-targz-src-pkg
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/sh
-# Test one of the main kernel Makefile targets to generate a perf sources 
tarball
-# suitable for build outside the full kernel sources.
-#
-# This is to test that the tools/perf/MANIFEST file lists all the files needed 
to
-# be in such tarball, which sometimes gets broken when we move files around,
-# like when we made some files that were in tools/perf/ available to other 
tools/
-# codebases by moving it to tools/include/, etc.
-
-PERF=$1
-cd ${PERF}/../..
-make perf-targz-src-pkg > /dev/null
-TARBALL=$(ls -rt perf-*.tar.gz)
-TMP_DEST=$(mktemp -d)
-tar xf ${TARBALL} -C $TMP_DEST
-rm -f ${TARBALL}
-cd - > /dev/null
-make -C $TMP_DEST/perf*/tools/perf > /dev/null 2>&1
-RC=$?
-rm -rf ${TMP_DEST}
-exit $RC
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index a10eaf5c4767..aa269eff798a 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -61,6 +61,7 @@ int test__switch_tracking(void);
 int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__kmod_path__parse(void);
+int test__thread_comm(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-comm.c b/tools/perf/tests/thread-comm.c
new file mode 100644
index 000000000000..d146dedf63b4
--- /dev/null
+++ b/tools/perf/tests/thread-comm.c
@@ -0,0 +1,47 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "debug.h"
+
+int test__thread_comm(void)
+{
+   struct machines machines;
+   struct machine *machine;
+   struct thread *t;
+
+   /*
+* This test is to check whether it can retrieve a correct
+* comm for a given time.  When multi-file data storage is
+* enabled, those task/comm events are processed first so the
+* later sample should find a matching comm properly.
+*/
+   machines__init();
+   machine = 
+
+   t = machine__findnew_thread(machine, 100, 100);
+   TEST_ASSERT_VAL("wrong init thread comm",
+   !strcmp(thread__comm_str(t), ":100"));
+
+   thread__set_comm(t, "perf-test1", 10000);
+   TEST_ASSERT_VAL("failed to override thread comm",
+   !strcmp(thread__comm_str(t), "perf-test1"));
+
+   thread__set_comm(t, "perf-test2", 20000);
+   thread__set_comm(t, "perf-test3", 30000);
+   thread__set_comm(t, "perf-test4", 40000);
+
+   TEST_ASSERT_VAL("failed to find timed comm",
+   !strcmp(thread__comm_str_by_time(t, 20000), 
"perf-test2"));
+   TEST_ASSERT_VAL("failed to find timed comm",
+   !strcmp(thread__comm_str_by_time(t, 35000), 
"perf-test3"));
+   TEST_ASSERT_VAL("failed to find timed comm",
+   !strcmp(thread__comm_str_by_time(t, 50000), 
"perf-test4"));
+
+   thread__set_comm(t, "perf-test1.5", 15000);
+   TEST_ASSERT_VAL("failed to sort timed comm",
+   !strcmp(thread__comm_str_by_time(t, 15000), 
"perf-test1.5"));
+
+   machine__delete_threads(machine);
+   machines__exit();
+   return 0;
+}
-- 
2.4.0


[PATCH 21/40] perf tools: Save timestamp of a map creation

2015-05-17 Thread Namhyung Kim

It'll be used to support multiple maps at the same address, as in the
dlopen() and/or JIT compile cases.

Cc: Stephane Eranian 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/dso.c |  2 +-
 tools/perf/util/machine.c | 28 
 tools/perf/util/machine.h |  2 +-
 tools/perf/util/map.c | 12 +++-
 tools/perf/util/map.h |  9 ++---
 tools/perf/util/probe-event.c |  2 +-
 tools/perf/util/symbol-elf.c  |  2 +-
 tools/perf/util/symbol.c  |  4 ++--
 8 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 13d9ae0bd15c..7078700233fa 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -747,7 +747,7 @@ struct map *dso__new_map(const char *name)
struct dso *dso = dso__new(name);
 
if (dso)
-   map = map__new2(0, dso, MAP__FUNCTION);
+   map = map__new2(0, dso, MAP__FUNCTION, 0);
 
return map;
 }
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 17e900a6a3c3..e3566365d320 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -668,7 +668,7 @@ int machine__process_itrace_start_event(struct machine 
*machine __maybe_unused,
 }
 
 struct map *machine__new_module(struct machine *machine, u64 start,
-   const char *filename)
+   const char *filename, u64 timestamp)
 {
struct map *map = NULL;
struct dso *dso;
@@ -686,7 +686,7 @@ struct map *machine__new_module(struct machine *machine, 
u64 start,
if (dso == NULL)
goto out;
 
-   map = map__new2(start, dso, MAP__FUNCTION);
+   map = map__new2(start, dso, MAP__FUNCTION, timestamp);
if (map == NULL)
goto out;
 
@@ -854,7 +854,7 @@ int __machine__create_kernel_maps(struct machine *machine, 
struct dso *kernel)
for (type = 0; type < MAP__NR_TYPES; ++type) {
struct kmap *kmap;
 
-   machine->vmlinux_maps[type] = map__new2(start, kernel, type);
+   machine->vmlinux_maps[type] = map__new2(start, kernel, type, 0);
if (machine->vmlinux_maps[type] == NULL)
return -1;
 
@@ -1155,7 +1155,7 @@ static int machine__create_module(void *arg, const char 
*name, u64 start)
struct machine *machine = arg;
struct map *map;
 
-   map = machine__new_module(machine, start, name);
+   map = machine__new_module(machine, start, name, 0);
if (map == NULL)
return -1;
 
@@ -1256,7 +1256,8 @@ static bool machine__uses_kcore(struct machine *machine)
 }
 
 static int machine__process_kernel_mmap_event(struct machine *machine,
- union perf_event *event)
+ union perf_event *event,
+ u64 timestamp)
 {
struct map *map;
char kmmap_prefix[PATH_MAX];
@@ -1279,7 +1280,7 @@ static int machine__process_kernel_mmap_event(struct 
machine *machine,
if (event->mmap.filename[0] == '/' ||
(!is_kernel_mmap && event->mmap.filename[0] == '[')) {
map = machine__new_module(machine, event->mmap.start,
- event->mmap.filename);
+ event->mmap.filename, timestamp);
if (map == NULL)
goto out_problem;
 
@@ -1343,7 +1344,7 @@ static int machine__process_kernel_mmap_event(struct 
machine *machine,
 
 int machine__process_mmap2_event(struct machine *machine,
 union perf_event *event,
-struct perf_sample *sample __maybe_unused)
+struct perf_sample *sample)
 {
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
struct thread *thread;
@@ -1356,7 +1357,8 @@ int machine__process_mmap2_event(struct machine *machine,
 
if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL ||
cpumode == PERF_RECORD_MISC_KERNEL) {
-   ret = machine__process_kernel_mmap_event(machine, event);
+   ret = machine__process_kernel_mmap_event(machine, event,
+sample->time);
if (ret < 0)
goto out_problem;
return 0;
@@ -1379,7 +1381,8 @@ int machine__process_mmap2_event(struct machine *machine,
event->mmap2.ino_generation,
event->mmap2.prot,
event->mmap2.flags,
-   event->mmap2.filename, type, thread);
+   event->mmap2.filename, type, thread,
+   sample->time);
 
if (map == NULL)
goto out_problem_map;
@@ -1396,7 +1399,7 @@ int machine__process_mmap2_event(struct machine

[PATCH 29/40] perf tools: Add dso__data_get/put_fd()

2015-05-17 Thread Namhyung Kim

Using dso__data_fd() in a multi-threaded environment is not safe since
the returned fd can be closed and/or reused at any time.  So convert it
to the dso__data_get/put_fd() pair to protect the access with a lock.

The original dso__data_fd() is deprecated and kept only for testing.
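
The pairing rule can be sketched like this.  It is a minimal model with
hypothetical demo_* names and a single global mutex; it does not open
any file and is not the code in dso.c.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_fd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical object caching a file descriptor (-1 when not open). */
struct demo_dso {
	int fd;
};

/*
 * Returns the cached descriptor with demo_fd_lock held.  The caller must
 * call demo_put_fd() afterwards, even when the returned value is negative.
 */
static int demo_get_fd(struct demo_dso *dso)
{
	assert(dso != NULL);
	pthread_mutex_lock(&demo_fd_lock);
	return dso->fd;
}

/* Releases the lock taken by demo_get_fd(). */
static void demo_put_fd(struct demo_dso *dso)
{
	(void)dso;
	pthread_mutex_unlock(&demo_fd_lock);
}

/* Use the descriptor only between get and put. */
static int demo_fd_is_open(struct demo_dso *dso)
{
	int fd = demo_get_fd(dso);
	int ok = (fd >= 0);	/* stand-in for real work on fd */

	demo_put_fd(dso);
	return ok;
}
```

Because the lock is held from get to put, no other thread can close or
reuse the descriptor while it is in use; the cost is that the critical
section lasts as long as the caller keeps the fd.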

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/dso.c  | 44 +-
 tools/perf/util/dso.h  |  9 ++--
 tools/perf/util/unwind-libunwind.c | 38 +++-
 3 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 8857287afc14..272b207f4bef 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -441,14 +441,15 @@ void dso__data_close(struct dso *dso)
 }
 
 /**
- * dso__data_fd - Get dso's data file descriptor
+ * dso__data_get_fd - Get dso's data file descriptor
  * @dso: dso object
  * @machine: machine object
  *
  * External interface to find dso's file, open it and
- * returns file descriptor.
+ * returns file descriptor.  Should be paired with
+ * dso__data_put_fd().
  */
-int dso__data_fd(struct dso *dso, struct machine *machine)
+int dso__data_get_fd(struct dso *dso, struct machine *machine)
 {
enum dso_binary_type binary_type_data[] = {
DSO_BINARY_TYPE__BUILD_ID_CACHE,
@@ -457,11 +458,11 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
};
int i = 0;
 
+   pthread_mutex_lock(__data_open_lock);
+
if (dso->data.status == DSO_DATA_STATUS_ERROR)
return -1;
 
-   pthread_mutex_lock(__data_open_lock);
-
if (dso->data.fd >= 0)
goto out;
 
@@ -484,10 +485,31 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
else
dso->data.status = DSO_DATA_STATUS_ERROR;
 
-   pthread_mutex_unlock(__data_open_lock);
return dso->data.fd;
 }
 
+void dso__data_put_fd(struct dso *dso __maybe_unused)
+{
+   pthread_mutex_unlock(__data_open_lock);
+}
+
+/**
+ * dso__data_fd - Get dso's data file descriptor
+ * @dso: dso object
+ * @machine: machine object
+ *
+ * Obsolete interface to find dso's file, open it and
+ * returns file descriptor.  It's not thread-safe in that
+ * the returned fd may be reused for other file.
+ */
+int dso__data_fd(struct dso *dso, struct machine *machine)
+{
+   int fd = dso__data_get_fd(dso, machine);
+
+   dso__data_put_fd(dso);
+   return fd;
+}
+
 bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 {
u32 flag = 1 << by;
@@ -1200,12 +1222,14 @@ size_t dso__fprintf(struct dso *dso, enum map_type 
type, FILE *fp)
 enum dso_type dso__type(struct dso *dso, struct machine *machine)
 {
int fd;
+   enum dso_type type = DSO__TYPE_UNKNOWN;
 
-   fd = dso__data_fd(dso, machine);
-   if (fd < 0)
-   return DSO__TYPE_UNKNOWN;
+   fd = dso__data_get_fd(dso, machine);
+   if (fd >= 0)
+   type = dso__type_fd(fd);
+   dso__data_put_fd(dso);
 
-   return dso__type_fd(fd);
+   return type;
 }
 
 int dso__strerror_load(struct dso *dso, char *buf, size_t buflen)
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index b26ec3ab1336..de9d98c44ae2 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -240,7 +240,9 @@ int __kmod_path__parse(struct kmod_path *m, const char 
*path,
 
 /*
  * The dso__data_* external interface provides following functions:
- *   dso__data_fd
+ *   dso__data_fd (obsolete)
+ *   dso__data_get_fd
+ *   dso__data_put_fd
  *   dso__data_close
  *   dso__data_size
  *   dso__data_read_offset
@@ -257,8 +259,9 @@ int __kmod_path__parse(struct kmod_path *m, const char 
*path,
  * The current usage of the dso__data_* interface is as follows:
  *
  * Get DSO's fd:
- *   int fd = dso__data_fd(dso, machine);
+ *   int fd = dso__data_get_fd(dso, machine);
  *   USE 'fd' SOMEHOW
+ *   dso__data_put_fd(dso)
  *
  * Read DSO's data:
  *   n = dso__data_read_offset(dso_0, , 0, buf, BUFSIZE);
@@ -278,6 +281,8 @@ int __kmod_path__parse(struct kmod_path *m, const char 
*path,
  * TODO
 */
 int dso__data_fd(struct dso *dso, struct machine *machine);
+int dso__data_get_fd(struct dso *dso, struct machine *machine);
+void dso__data_put_fd(struct dso *dso);
 void dso__data_close(struct dso *dso);
 
 off_t dso__data_size(struct dso *dso, struct machine *machine);
diff --git a/tools/perf/util/unwind-libunwind.c 
b/tools/perf/util/unwind-libunwind.c
index b3214f76fd43..58a9238b9b3e 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -270,13 +270,13 @@ static int read_unwind_spec_eh_frame(struct dso *dso, 
struct machine *machine,
u64 offset = dso->data.eh_frame_hdr_offset;
 
if (offset == 0) {
-   fd = dso__data_fd(dso, machine);
-   if (fd < 0)
-   return -EINVAL;
-
-   /* Check the .eh_frame section for

[PATCH 27/40] perf tools: Protect dso cache fd with a mutex

2015-05-17 Thread Namhyung Kim

When the dso cache is accessed in a multi-threaded environment, another
thread can close dso->data.fd during the operation because of the open
file limit.  Protect the file descriptors using a separate mutex.
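
A simplified simulation of the problem and the fix is shown below.  It
assumes a limit of one open file and uses fake descriptor numbers; all
demo_* names are hypothetical and no real file is opened.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t demo_open_lock = PTHREAD_MUTEX_INITIALIZER;

struct demo_file {
	int fd;		/* -1 when closed */
	int opens;	/* how many times it has been (re)opened */
};

/* Hypothetical limit of one open file, to force evictions. */
static struct demo_file *demo_open_slot;
static int demo_next_fd = 3;

/* Caller must hold demo_open_lock. */
static void demo_open_locked(struct demo_file *f)
{
	if (demo_open_slot != NULL && demo_open_slot != f)
		demo_open_slot->fd = -1;	/* evict the other file */
	f->fd = demo_next_fd++;
	f->opens++;
	demo_open_slot = f;
}

/* Re-check the descriptor under the lock and reopen it if it was evicted. */
static int demo_read(struct demo_file *f)
{
	int fd;

	assert(f != NULL);
	pthread_mutex_lock(&demo_open_lock);
	if (f->fd < 0)
		demo_open_locked(f);
	fd = f->fd;		/* valid only while the lock is held */
	pthread_mutex_unlock(&demo_open_lock);
	return fd;
}
```

The check for a closed descriptor has to happen after taking the lock;
checking first and locking later would leave a window in which another
thread could evict the file.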

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/dso.c | 98 +--
 1 file changed, 72 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 425989e85302..8857287afc14 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -265,6 +265,7 @@ int __kmod_path__parse(struct kmod_path *m, const char 
*path,
  */
 static LIST_HEAD(dso__data_open);
 static long dso__data_open_cnt;
+static pthread_mutex_t dso__data_open_lock = PTHREAD_MUTEX_INITIALIZER;
 
 static void dso__list_add(struct dso *dso)
 {
@@ -434,7 +435,9 @@ static void check_data_close(void)
  */
 void dso__data_close(struct dso *dso)
 {
+   pthread_mutex_lock(&dso__data_open_lock);
close_dso(dso);
+   pthread_mutex_unlock(&dso__data_open_lock);
 }
 
 /**
@@ -457,6 +460,8 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
if (dso->data.status == DSO_DATA_STATUS_ERROR)
return -1;
 
+   pthread_mutex_lock(&dso__data_open_lock);
+
if (dso->data.fd >= 0)
goto out;
 
@@ -479,6 +484,7 @@ int dso__data_fd(struct dso *dso, struct machine *machine)
else
dso->data.status = DSO_DATA_STATUS_ERROR;
 
+   pthread_mutex_unlock(&dso__data_open_lock);
return dso->data.fd;
 }
 
@@ -583,7 +589,8 @@ dso_cache__memcpy(struct dso_cache *cache, u64 offset,
 }
 
 static ssize_t
-dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+dso_cache__read(struct dso *dso, struct machine *machine,
+   u64 offset, u8 *data, ssize_t size)
 {
struct dso_cache *cache;
struct dso_cache *old;
@@ -592,11 +599,24 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, 
ssize_t size)
do {
u64 cache_offset;
 
-   ret = -ENOMEM;
-
cache = zalloc(sizeof(*cache) + DSO__DATA_CACHE_SIZE);
if (!cache)
-   break;
+   return -ENOMEM;
+
+   pthread_mutex_lock(&dso__data_open_lock);
+
+   /*
+* dso->data.fd might be closed if other thread opened another
+* file (dso) due to open file limit (RLIMIT_NOFILE).
+*/
+   if (dso->data.fd < 0) {
+   dso->data.fd = open_dso(dso, machine);
+   if (dso->data.fd < 0) {
+   ret = -errno;
+   dso->data.status = DSO_DATA_STATUS_ERROR;
+   break;
+   }
+   }
 
cache_offset = offset & DSO__DATA_CACHE_MASK;
 
@@ -606,6 +626,11 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, 
ssize_t size)
 
cache->offset = cache_offset;
cache->size   = ret;
+   } while (0);
+
+   pthread_mutex_unlock(&dso__data_open_lock);
+
+   if (ret > 0) {
old = dso_cache__insert(dso, cache);
if (old) {
/* we lose the race */
@@ -614,8 +639,7 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, 
ssize_t size)
}
 
ret = dso_cache__memcpy(cache, offset, data, size);
-
-   } while (0);
+   }
 
if (ret <= 0)
free(cache);
@@ -623,8 +647,8 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, 
ssize_t size)
return ret;
 }
 
-static ssize_t dso_cache_read(struct dso *dso, u64 offset,
- u8 *data, ssize_t size)
+static ssize_t dso_cache_read(struct dso *dso, struct machine *machine,
+ u64 offset, u8 *data, ssize_t size)
 {
struct dso_cache *cache;
 
@@ -632,7 +656,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
if (cache)
return dso_cache__memcpy(cache, offset, data, size);
else
-   return dso_cache__read(dso, offset, data, size);
+   return dso_cache__read(dso, machine, offset, data, size);
 }
 
 /*
@@ -640,7 +664,8 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
  * in the rb_tree. Any read to already cached data is served
  * by cached data.
  */
-static ssize_t cached_read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
+static ssize_t cached_read(struct dso *dso, struct machine *machine,
+  u64 offset, u8 *data, ssize_t size)
 {
ssize_t r = 0;
u8 *p = data;
@@ -648,7 +673,7 @@ static ssize_t cached_read(struct dso *dso, u64 offset, u8 
*data, ssize_t size)
do {
ssize_t ret;
 
-   ret = dso_cache_read(dso, offset, p, size);
+   ret = dso_cache_read(dso, machine, offset, p, size);
if

[PATCH 28/40] perf callchain: Maintain libunwind's address space in map_groups

2015-05-17 Thread Namhyung Kim

Currently the address_space is kept in the thread struct, but it's more
appropriate to keep it in map_groups since map_groups is what is
maintained with time.  Also we don't need to flush after exec since it
can still be accessed when used with an indexed data file.

Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/dwarf-unwind.c|  4 ++--
 tools/perf/util/map.c  |  5 +
 tools/perf/util/map.h  |  1 +
 tools/perf/util/thread.c   |  7 ---
 tools/perf/util/unwind-libunwind.c | 28 +---
 tools/perf/util/unwind.h   | 15 ++-
 6 files changed, 27 insertions(+), 33 deletions(-)

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 1926799bfcdb..0e572eeabdb7 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -143,6 +143,8 @@ int test__dwarf_unwind(void)
struct thread *thread;
int err = -1;
 
+   callchain_param.record_mode = CALLCHAIN_DWARF;
+
machines__init();
 
machine = machines__find(, HOST_KERNEL_ID);
@@ -151,8 +153,6 @@ int test__dwarf_unwind(void)
return -1;
}
 
-   callchain_param.record_mode = CALLCHAIN_DWARF;
-
if (init_live_machine(machine)) {
pr_err("Could not init machine\n");
goto out;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 8bc016648a34..a35fed9e5eba 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -14,6 +14,7 @@
 #include "util.h"
 #include "debug.h"
 #include "machine.h"
+#include "unwind.h"
 #include 
 
 const char *map_type__name[MAP__NR_TYPES] = {
@@ -431,6 +432,8 @@ void map_groups__init(struct map_groups *mg, struct machine 
*machine)
mg->refcnt = 1;
mg->timestamp = 0;
INIT_LIST_HEAD(&mg->list);
+
+   unwind__prepare_access(mg);
 }
 
 static void maps__delete(struct rb_root *maps)
@@ -464,6 +467,8 @@ void map_groups__exit(struct map_groups *mg)
maps__delete(&mg->maps[i]);
maps__delete_removed(&mg->removed_maps[i]);
}
+
+   unwind__finish_access(mg);
 }
 
 bool map_groups__empty(struct map_groups *mg)
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index fc8cdb8853f5..a578771dd8f4 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -67,6 +67,7 @@ struct map_groups {
u64  timestamp;
int  refcnt;
struct list_head list;
+   void *priv;
 };
 
 struct map_groups *map_groups__new(struct machine *machine);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 3fa3e558316a..702f12dc5a90 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -105,9 +105,6 @@ struct thread *thread__new(pid_t pid, pid_t tid)
INIT_LIST_HEAD(>tid_node);
INIT_LIST_HEAD(>mg_list);
 
-   if (unwind__prepare_access(thread) < 0)
-   goto err_thread;
-
comm_str = malloc(32);
if (!comm_str)
goto err_thread;
@@ -153,7 +150,6 @@ void thread__delete(struct thread *thread)
list_del(>list);
comm__free(comm);
}
-   unwind__finish_access(thread);
 
free(thread);
 }
@@ -250,9 +246,6 @@ int __thread__set_comm(struct thread *thread, const char 
*str, u64 timestamp,
break;
}
list_add_tail(>list, >list);
-
-   if (exec)
-   unwind__flush_access(thread);
}
 
if (exec) {
diff --git a/tools/perf/util/unwind-libunwind.c 
b/tools/perf/util/unwind-libunwind.c
index 0978697341c1..b3214f76fd43 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -32,6 +32,7 @@
 #include "symbol.h"
 #include "util.h"
 #include "debug.h"
+#include "map.h"
 
 extern int
 UNW_OBJ(dwarf_search_unwind_table) (unw_addr_space_t as,
@@ -561,7 +562,7 @@ static unw_accessors_t accessors = {
.get_proc_name  = get_proc_name,
 };
 
-int unwind__prepare_access(struct thread *thread)
+int unwind__prepare_access(struct map_groups *mg)
 {
unw_addr_space_t addr_space;
 
@@ -575,41 +576,38 @@ int unwind__prepare_access(struct thread *thread)
}
 
unw_set_caching_policy(addr_space, UNW_CACHE_GLOBAL);
-   thread__set_priv(thread, addr_space);
+   mg->priv = addr_space;
 
return 0;
 }
 
-void unwind__flush_access(struct thread *thread)
+void unwind__finish_access(struct map_groups *mg)
 {
-   unw_addr_space_t addr_space;
+   unw_addr_space_t addr_space = mg->priv;
 
if (callchain_param.record_mode != CALLCHAIN_DWARF)
return;
 
-   addr_space = thread__priv(thread);
-   unw_flush_cache(addr_space, 0, 0);
-}
-
-void unwind__finish_access(struct thread *thread)
-{
-   unw_addr_space_t addr_space;
-
-   if

[PATCH 12/40] perf tools: Use thread__comm_by_time() when adding hist entries

2015-05-17 Thread Namhyung Kim

Now that thread->comm can be looked up by time, use
thread__comm_by_time() to find the correct comm when adding hist entries.
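
As a rough illustration of what a time-based comm lookup involves, here
is a minimal sketch.  It is not the thread__comm_by_time() implementation:
struct comm_entry, comm_by_time() and the fallback to the oldest entry are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct comm_entry {
	uint64_t start;		/* time at which this comm became active */
	const char *str;
};

/*
 * Entries are ordered newest first.  Return the comm that was active at
 * 'timestamp', i.e. the newest entry that started at or before it.
 */
static const char *comm_by_time(const struct comm_entry *comms, size_t nr,
				uint64_t timestamp)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (timestamp >= comms[i].start)
			return comms[i].str;
	}

	/* The sample predates every known comm: use the oldest (assumption). */
	return nr ? comms[nr - 1].str : NULL;
}
```

With this shape, a timestamp of -1 converted to an unsigned 64-bit value
compares greater than or equal to every start time, so it selects the
newest comm.  That is consistent with the tests in this patch passing -1
where no sample time is available.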

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-annotate.c |  5 +++--
 tools/perf/builtin-diff.c |  8 
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c| 19 ++-
 tools/perf/util/hist.h|  2 +-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b57a027fb200..761f902473b7 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -47,7 +47,7 @@ struct perf_annotate {
 };
 
 static int perf_evsel__add_sample(struct perf_evsel *evsel,
- struct perf_sample *sample __maybe_unused,
+ struct perf_sample *sample,
  struct addr_location *al,
  struct perf_annotate *ann)
 {
@@ -67,7 +67,8 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
 
-   he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+   he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0,
+   sample->time, true);
if (he == NULL)
return -ENOMEM;
 
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index daaa7dca9c3b..0fe54a633a5e 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -312,10 +312,10 @@ static int formula_fprintf(struct hist_entry *he, struct 
hist_entry *pair,
 
 static int hists__add_entry(struct hists *hists,
struct addr_location *al, u64 period,
-   u64 weight, u64 transaction)
+   u64 weight, u64 transaction, u64 timestamp)
 {
if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
-  transaction, true) != NULL)
+  transaction, timestamp, true) != NULL)
return 0;
return -ENOMEM;
 }
@@ -336,8 +336,8 @@ static int diff__process_sample_event(struct perf_tool 
*tool __maybe_unused,
return -1;
}
 
-   if (hists__add_entry(hists, , sample->period,
-sample->weight, sample->transaction)) {
+   if (hists__add_entry(hists, , sample->period, sample->weight,
+sample->transaction, sample->time)) {
pr_warning("problem incrementing symbol period, skipping 
event\n");
goto out_put;
}
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 8c102b011424..27bae90c9a95 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(hists, , NULL,
-   NULL, NULL, 1, 1, 0, true);
+   NULL, NULL, 1, 1, 0, -1, true);
if (he == NULL) {
addr_location__put();
goto out;
@@ -116,7 +116,7 @@ static int add_hist_entries(struct perf_evlist *evlist, 
struct machine *machine)
goto out;
 
he = __hists__add_entry(hists, , NULL,
-   NULL, NULL, 1, 1, 0, true);
+   NULL, NULL, 1, 1, 0, -1, true);
if (he == NULL) {
addr_location__put();
goto out;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 338770679863..f13993e53e4e 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -442,11 +442,11 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
  struct branch_info *bi,
  struct mem_info *mi,
  u64 period, u64 weight, u64 transaction,
- bool sample_self)
+ u64 timestamp, bool sample_self)
 {
struct hist_entry entry = {
.thread = al->thread,
-   .comm = thread__comm(al->thread),
+   .comm = thread__comm_by_time(al->thread, timestamp),
.ms = {
.map= al->map,
.sym= al->sym,
@@ -504,13 +504,14 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, 
struct addr_location *al
 {
u64 cost;
struct mem_info *mi = iter->priv;
+   struct perf_sample *sample = iter->sample;
struct hists *hists =

[PATCH 35/40] perf record: Synthesize COMM event for a command line workload

2015-05-17 Thread Namhyung Kim

When perf creates a new child to profile, the events are enabled on
exec().  In this case, perf doesn't synthesize any event for the child
since they'll be generated during exec().  But there's a window between
the enabling and the event generation.

This used to be harmless since the samples are only in the kernel (so we
always have the map) and the comm is overridden by a later COMM event.
However it won't work anymore since those samples will go to a missing
thread now while the COMM event will create a (current) thread.  This
leads to those early samples (like native_write_msr_safe) showing a pid
(like ':15328') instead of a comm.

So it needs to synthesize a COMM event for the child explicitly before
enabling so that it can have a correct comm.  But at this time, the
comm will be "perf" since it's not exec-ed yet.

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-record.c | 18 +-
 tools/perf/util/event.c |  2 +-
 tools/perf/util/event.h |  5 +
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 978ebf648aab..153f38e3 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -766,8 +766,24 @@ static int __cmd_record(struct record *rec, int argc, 
const char **argv)
/*
 * Let the child rip
 */
-   if (forks)
+   if (forks) {
+   union perf_event *comm_event;
+
+   comm_event = malloc(sizeof(*comm_event) + machine->id_hdr_size);
+   if (comm_event == NULL)
+   goto out_child;
+
+   err = perf_event__synthesize_comm(tool, comm_event,
+ rec->evlist->workload.pid,
+ process_synthesized_event,
+ machine);
+   free(comm_event);
+
+   if (err < 0)
+   goto out_child;
+
perf_evlist__start_workload(rec->evlist);
+   }
 
if (opts->initial_delay) {
usleep(opts->initial_delay * 1000);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 930d45d5a37a..3adca1302150 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -165,7 +165,7 @@ static int perf_event__prepare_comm(union perf_event 
*event, pid_t pid,
return 0;
 }
 
-static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 union perf_event *event, pid_t pid,
 perf_event__handler_t process,
 struct machine *machine)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 40e02544f861..9aacd558ac0f 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -452,6 +452,11 @@ int perf_event__synthesize_mmap_events(struct perf_tool 
*tool,
   struct machine *machine,
   bool mmap_data);
 
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+ union perf_event *event, pid_t pid,
+ perf_event__handler_t process,
+ struct machine *machine);
+
 size_t perf_event__fprintf_comm(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp);
 size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp);
-- 
2.4.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 30/40] perf session: Pass struct events stats to event processing functions

2015-05-17 Thread Namhyung Kim

Pass the stats structure so that it can point to a separate object when
used in a multi-threaded environment.

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/session.c | 71 ++-
 1 file changed, 45 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index bc738216de36..0d080a95d2ff 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -18,6 +18,7 @@
 #include "auxtrace.h"
 
 static int perf_session__deliver_event(struct perf_session *session,
+  struct events_stats *stats,
   union perf_event *event,
   struct perf_sample *sample,
   struct perf_tool *tool,
@@ -106,7 +107,8 @@ static int ordered_events__deliver_event(struct 
ordered_events *oe,
return ret;
}
 
-   return perf_session__deliver_event(session, event->event, ,
+   return perf_session__deliver_event(session, >evlist->stats,
+  event->event, ,
   session->tool, event->file_offset);
 }
 
@@ -942,6 +944,7 @@ static struct machine *machines__find_for_cpumode(struct 
machines *machines,
 }
 
 static int deliver_sample_value(struct perf_evlist *evlist,
+   struct events_stats *stats,
struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
@@ -957,7 +960,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
}
 
if (!sid || sid->evsel == NULL) {
-   ++evlist->stats.nr_unknown_id;
+   ++stats->nr_unknown_id;
return 0;
}
 
@@ -965,6 +968,7 @@ static int deliver_sample_value(struct perf_evlist *evlist,
 }
 
 static int deliver_sample_group(struct perf_evlist *evlist,
+   struct events_stats *stats,
struct perf_tool *tool,
union  perf_event *event,
struct perf_sample *sample,
@@ -974,7 +978,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
u64 i;
 
for (i = 0; i < sample->read.group.nr; i++) {
-   ret = deliver_sample_value(evlist, tool, event, sample,
+   ret = deliver_sample_value(evlist, stats, tool, event, sample,
   >read.group.values[i],
   machine);
if (ret)
@@ -986,6 +990,7 @@ static int deliver_sample_group(struct perf_evlist *evlist,
 
 static int
  perf_evlist__deliver_sample(struct perf_evlist *evlist,
+struct events_stats *stats,
 struct perf_tool *tool,
 union  perf_event *event,
 struct perf_sample *sample,
@@ -1002,14 +1007,15 @@ static int
 
/* For PERF_SAMPLE_READ we have either single or group mode. */
if (read_format & PERF_FORMAT_GROUP)
-   return deliver_sample_group(evlist, tool, event, sample,
+   return deliver_sample_group(evlist, stats, tool, event, sample,
machine);
else
-   return deliver_sample_value(evlist, tool, event, sample,
+   return deliver_sample_value(evlist, stats, tool, event, sample,
>read.one, machine);
 }
 
 static int machines__deliver_event(struct machines *machines,
+  struct events_stats *stats,
   struct perf_evlist *evlist,
   union perf_event *event,
   struct perf_sample *sample,
@@ -1028,14 +1034,15 @@ static int machines__deliver_event(struct machines 
*machines,
case PERF_RECORD_SAMPLE:
dump_sample(evsel, event, sample);
if (evsel == NULL) {
-   ++evlist->stats.nr_unknown_id;
+   ++stats->nr_unknown_id;
return 0;
}
if (machine == NULL) {
-   ++evlist->stats.nr_unprocessable_samples;
+   ++stats->nr_unprocessable_samples;
return 0;
}
-   return perf_evlist__deliver_sample(evlist, tool, event, sample, 
evsel, machine);
+   return perf_evlist__deliver_sample(evlist, stats, tool, event,
+  sample, evsel, machine);
case PERF_RECORD_MMAP:
return tool->mmap(tool, event, sample, machine);
case PERF_RECORD_MMAP2:
@@ -1048,7 +1055,7 @@ static int

[PATCH 34/40] perf report: Parallelize perf report using multi-thread

2015-05-17 Thread Namhyung Kim

Introduce perf_session__process_events_mt() to enable multi-threaded
sample processing.  It allocates a struct perf_tool_mt and fills in the
needed info.

The session and hists event stats are counted for each thread and
summed after the processing finishes.  Similarly, hist entries are
added to per-thread hists first and then moved to the original hists
using hists__mt_resort().  This function reuses the
hists__collapse_resort() code, so it forces sort__need_collapse to true
and skips the collapsing function.

Note that most of the preprocessing stage is already done by processing
meta events in the dummy tracking evsel first.  We can find the
corresponding thread and map based on the sample time, and symbol
loading and dso cache access are protected by a pthread mutex.
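
The "count per thread, sum afterwards" scheme can be sketched as follows.
This is an illustration only: struct stats, struct worker, process_mt()
and the fixed NR_WORKERS are hypothetical and far simpler than struct
events_stats and struct perf_tool_mt.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define NR_WORKERS 4

struct stats {
	uint64_t nr_samples;
	uint64_t total_period;
};

struct worker {
	pthread_t thread;
	struct stats stats;	/* private to this worker, so no locking */
	uint64_t nr_events;	/* hypothetical amount of work to process */
};

static void stats_add(struct stats *dst, const struct stats *src)
{
	dst->nr_samples   += src->nr_samples;
	dst->total_period += src->total_period;
}

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	uint64_t i;

	for (i = 0; i < w->nr_events; i++) {
		w->stats.nr_samples++;
		w->stats.total_period += 10;
	}
	return NULL;
}

static int process_mt(struct stats *total, uint64_t events_per_worker)
{
	struct worker workers[NR_WORKERS];
	int i, started = 0, err = 0;

	memset(workers, 0, sizeof(workers));

	for (i = 0; i < NR_WORKERS; i++) {
		workers[i].nr_events = events_per_worker;
		if (pthread_create(&workers[i].thread, NULL,
				   worker_fn, &workers[i])) {
			err = -1;
			break;
		}
		started++;
	}

	/* A worker's stats are merged only after that worker has exited. */
	for (i = 0; i < started; i++) {
		pthread_join(workers[i].thread, NULL);
		stats_add(total, &workers[i].stats);
	}
	return err;
}
```

Because each worker writes only to its own struct stats, the hot path
needs no atomics or locks; the single-threaded merge after the join is
the only place where the totals are touched.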

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/hist.c|  75 +++
 tools/perf/util/hist.h|   3 +
 tools/perf/util/session.c | 153 ++
 tools/perf/util/session.h |   1 +
 tools/perf/util/tool.h|  12 
 5 files changed, 231 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c11f7fdc08fd..1868116cdfb4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -940,7 +940,7 @@ void hist_entry__delete(struct hist_entry *he)
  * collapse the histogram
  */
 
-static bool hists__collapse_insert_entry(struct hists *hists __maybe_unused,
+static bool hists__collapse_insert_entry(struct hists *hists,
 struct rb_root *root,
 struct hist_entry *he)
 {
@@ -977,6 +977,13 @@ static bool hists__collapse_insert_entry(struct hists 
*hists __maybe_unused,
}
hists->nr_entries++;
 
+   /*
+* For multi-threaded report, he->hists points to a dummy
+* hists in the struct perf_tool_mt.  Please see
+* perf_session__process_events_mt().
+*/
+   he->hists = hists;
+
rb_link_node(&he->rb_node_in, parent, p);
rb_insert_color(&he->rb_node_in, root);
return true;
@@ -1004,19 +1011,12 @@ static void hists__apply_filters(struct hists *hists, 
struct hist_entry *he)
hists__filter_entry_by_symbol(hists, he);
 }
 
-void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+static void __hists__collapse_resort(struct hists *hists, struct rb_root *root,
+struct ui_progress *prog)
 {
-   struct rb_root *root;
struct rb_node *next;
struct hist_entry *n;
 
-   if (!sort__need_collapse)
-   return;
-
-   hists->nr_entries = 0;
-
-   root = hists__get_rotate_entries_in(hists);
-
next = rb_first(root);
 
while (next) {
@@ -1039,6 +1039,27 @@ void hists__collapse_resort(struct hists *hists, struct 
ui_progress *prog)
}
 }
 
+void hists__collapse_resort(struct hists *hists, struct ui_progress *prog)
+{
+   struct rb_root *root;
+
+   if (!sort__need_collapse)
+   return;
+
+   hists->nr_entries = 0;
+
+   root = hists__get_rotate_entries_in(hists);
+   __hists__collapse_resort(hists, root, prog);
+}
+
+void hists__mt_resort(struct hists *dst, struct hists *src)
+{
+   struct rb_root *root = src->entries_in;
+
+   sort__need_collapse = 1;
+   __hists__collapse_resort(dst, root, NULL);
+}
+
 static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
 {
struct perf_hpp_fmt *fmt;
@@ -1268,6 +1289,29 @@ void events_stats__inc(struct events_stats *stats, u32 
type)
++stats->nr_events[type];
 }
 
+void events_stats__add(struct events_stats *dst, struct events_stats *src)
+{
+   int i;
+
+#define ADD(_field)  dst->_field += src->_field
+
+   ADD(total_period);
+   ADD(total_non_filtered_period);
+   ADD(total_lost);
+   ADD(total_invalid_chains);
+   ADD(nr_non_filtered_samples);
+   ADD(nr_lost_warned);
+   ADD(nr_unknown_events);
+   ADD(nr_invalid_chains);
+   ADD(nr_unknown_id);
+   ADD(nr_unprocessable_samples);
+
+   for (i = 0; i < PERF_RECORD_HEADER_MAX; i++)
+   ADD(nr_events[i]);
+
+#undef ADD
+}
+
 void hists__inc_nr_events(struct hists *hists, u32 type)
 {
events_stats__inc(&hists->stats, type);
@@ -1444,16 +1488,21 @@ int perf_hist_config(const char *var, const char *value)
return 0;
 }
 
-static int hists_evsel__init(struct perf_evsel *evsel)
+void __hists__init(struct hists *hists)
 {
-   struct hists *hists = evsel__hists(evsel);
-
memset(hists, 0, sizeof(*hists));
hists->entries_in_array[0] = hists->entries_in_array[1] = RB_ROOT;
hists->entries_in = &hists->entries_in_array[0];
hists->entries_collapsed = RB_ROOT;
hists->entries = RB_ROOT;
pthread_mutex_init(&hists->lock, NULL);
+}
+
+static int hists_evsel__init(struct perf_evsel *evsel)
+{
+   struct hists *hists = evsel__hists(evsel);
+
+

[PATCH 39/40] perf data: Implement 'index' subcommand

2015-05-17 Thread Namhyung Kim

The index command first splits a given data file into intermediate
data files and then merges them into a final data file with an index
table so that it can be processed using multiple threads.  The
HEADER_DATA_INDEX feature bit is added to identify a data file that has
an index table.

Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-data.txt |  25 ++-
 tools/perf/builtin-data.c  | 351 -
 2 files changed, 372 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-data.txt 
b/tools/perf/Documentation/perf-data.txt
index be8fa1a0a97e..fdac46ea6732 100644
--- a/tools/perf/Documentation/perf-data.txt
+++ b/tools/perf/Documentation/perf-data.txt
@@ -22,6 +22,11 @@ COMMANDS
like:
  perf --debug data-convert data convert ...
 
+index::
+   Build an index table for data file so that it can be processed
+   with multiple threads concurrently.
+
+
 OPTIONS for 'convert'
 -
 --to-ctf::
@@ -34,7 +39,25 @@ OPTIONS for 'convert'
 --verbose::
 Be more verbose (show counter open errors, etc).
 
+OPTIONS for 'index'
+---
+-i::
+--input::
+   Specify input perf data file path.
+
+-o::
+--output::
+   Specify output perf data directory path.
+
+-v::
+--verbose::
+Be more verbose (show counter open errors, etc).
+
+-f::
+--force::
+Don't complain, do it.
+
 SEE ALSO
 
-linkperf:perf[1]
+linkperf:perf[1], linkperf:perf-report[1]
 [1] Common Trace Format - http://www.efficios.com/ctf
diff --git a/tools/perf/builtin-data.c b/tools/perf/builtin-data.c
index d6525bc54d13..71eb6db8b7ff 100644
--- a/tools/perf/builtin-data.c
+++ b/tools/perf/builtin-data.c
@@ -2,11 +2,16 @@
 #include "builtin.h"
 #include "perf.h"
 #include "debug.h"
+#include "session.h"
+#include "evlist.h"
 #include "parse-options.h"
 #include "data-convert-bt.h"
+#include 
 
 typedef int (*data_cmd_fn_t)(int argc, const char **argv, const char *prefix);
 
+static const char *output_name;
+
 struct data_cmd {
const char  *name;
const char  *summary;
@@ -44,6 +49,15 @@ static void print_usage(void)
printf("\n");
 }
 
+static int cmd_data_convert(int argc, const char **argv, const char *prefix);
+static int data_cmd_index(int argc, const char **argv, const char *prefix);
+
+static struct data_cmd data_cmds[] = {
+   { "convert", "converts data file between formats", cmd_data_convert },
+   { "index", "merge data file and add index", data_cmd_index },
+   { .name = NULL, },
+};
+
 static const char * const data_convert_usage[] = {
"perf data convert []",
NULL
@@ -88,11 +102,342 @@ static int cmd_data_convert(int argc, const char **argv,
return 0;
 }
 
-static struct data_cmd data_cmds[] = {
-   { "convert", "converts data file between formats", cmd_data_convert },
-   { .name = NULL, },
+#define FD_HASH_BITS  7
+#define FD_HASH_SIZE  (1 << FD_HASH_BITS)
+#define FD_HASH_MASK  (FD_HASH_SIZE - 1)
+
+struct data_index {
+   struct perf_tool    tool;
+   struct perf_session *session;
+   enum {
+   PER_CPU,
+   PER_THREAD,
+   } split_mode;
+   char*tmpdir;
+   int header_fd;
+   u64 header_written;
+   struct hlist_head   fd_hash[FD_HASH_SIZE];
+   int fd_hash_nr;
+   int output_fd;
 };
 
+struct fdhash_node {
+   int id;
+   int fd;
+   struct hlist_node   list;
+};
+
+static struct hlist_head *get_hash(struct data_index *idx, int id)
+{
+   return &idx->fd_hash[id % FD_HASH_MASK];
+}
+
+static int perf_event__rewrite_header(struct perf_tool *tool,
+ union perf_event *event)
+{
+   struct data_index *idx = container_of(tool, struct data_index, tool);
+   ssize_t size;
+
+   size = writen(idx->header_fd, event, event->header.size);
+   if (size < 0)
+   return -errno;
+
+   idx->header_written += size;
+   return 0;
+}
+
+static int split_other_events(struct perf_tool *tool,
+   union perf_event *event,
+   struct perf_sample *sample __maybe_unused,
+   struct machine *machine __maybe_unused)
+{
+   return perf_event__rewrite_header(tool, event);
+}
+
+static int split_sample_event(struct perf_tool *tool,
+   union perf_event *event,
+   struct perf_sample *sample,
+   struct perf_evsel *evsel __maybe_unused,
+   struct machine *machine __maybe_unused)
+{
+   struct data_index *idx = container_of(tool, struct data_index, tool);
+   int id = idx->split_mode == PER_CPU ? sample->cpu : sample->tid;
+   int fd = -1;
+

[PATCH 32/40] perf tools: Move BUILD_ID_SIZE definition to perf.h

2015-05-17 Thread Namhyung Kim

The util/event.h header includes util/build-id.h only for BUILD_ID_SIZE.
This is a problem when I include util/event.h from util/tool.h, which
is itself included by util/build-id.h, since it creates a circular
dependency that results in an incomplete type error.

Signed-off-by: Namhyung Kim 
---
 tools/perf/perf.h  | 1 +
 tools/perf/util/build-id.h | 2 --
 tools/perf/util/dso.h  | 1 +
 tools/perf/util/event.h| 1 -
 4 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 192d936020ea..1f12336ca5a8 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -30,6 +30,7 @@ static inline unsigned long long rdclock(void)
 }
 
 #define MAX_NR_CPUS1024
+#define BUILD_ID_SIZE  20
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index 85011222cc14..e71304c9c86f 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -1,8 +1,6 @@
 #ifndef PERF_BUILD_ID_H_
 #define PERF_BUILD_ID_H_ 1
 
-#define BUILD_ID_SIZE 20
-
 #include "tool.h"
 #include "strlist.h"
 #include 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index de9d98c44ae2..aa1f503c30bf 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include "map.h"
+#include "perf.h"
 #include "build-id.h"
 
 enum dso_binary_type {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 97179abc80a1..40e02544f861 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -6,7 +6,6 @@
 
 #include "../perf.h"
 #include "map.h"
-#include "build-id.h"
 #include "perf_regs.h"
 
 struct mmap_event {
-- 
2.4.0


[PATCH 36/40] perf tools: Fix progress ui to support multi thread

2015-05-17 Thread Namhyung Kim

Split the ui_progress struct into a global and a local one.  Each thread
updates its local struct without a lock and only updates the global one
(with a lock) when meaningful progress has been made.

To do that, pass struct ui_progress to __perf_session__process_events()
and set it for the total size of multi-file storage.
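
The global/local split can be sketched like this.  It is a simplified
illustration under assumptions: the struct and function names below are
hypothetical and are not the ui_progress API, and "meaningful progress"
is modelled as a fixed step size.

```c
#include <pthread.h>
#include <stdint.h>

struct global_progress {
	pthread_mutex_t lock;
	uint64_t curr;
	uint64_t total;
	uint64_t step;		/* smallest amount worth publishing */
};

struct local_progress {
	struct global_progress *global;
	uint64_t pending;	/* progress not yet published */
};

static void global_progress_init(struct global_progress *gp,
				 uint64_t total, uint64_t step)
{
	pthread_mutex_init(&gp->lock, NULL);
	gp->curr = 0;
	gp->total = total;
	gp->step = step;
}

/* Publish whatever is pending; this is the only path that takes the lock. */
static void local_progress_flush(struct local_progress *lp)
{
	pthread_mutex_lock(&lp->global->lock);
	lp->global->curr += lp->pending;
	pthread_mutex_unlock(&lp->global->lock);
	lp->pending = 0;
}

/* Called by a worker for every processed chunk. */
static void local_progress_update(struct local_progress *lp, uint64_t nr)
{
	lp->pending += nr;
	if (lp->pending >= lp->global->step)
		local_progress_flush(lp);
}
```

Each thread would own one struct local_progress and call
local_progress_flush() once more when it finishes, so that the remainder
below the step size is not lost.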

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/hist.c|  5 ++--
 tools/perf/util/hist.h|  3 +-
 tools/perf/util/session.c | 71 ++-
 tools/perf/util/tool.h|  3 ++
 4 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 1868116cdfb4..eafb09e8c487 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1052,12 +1052,13 @@ void hists__collapse_resort(struct hists *hists, struct 
ui_progress *prog)
__hists__collapse_resort(hists, root, prog);
 }
 
-void hists__mt_resort(struct hists *dst, struct hists *src)
+void hists__mt_resort(struct hists *dst, struct hists *src,
+ struct ui_progress *prog)
 {
struct rb_root *root = src->entries_in;
 
sort__need_collapse = 1;
-   __hists__collapse_resort(dst, root, NULL);
+   __hists__collapse_resort(dst, root, prog);
 }
 
 static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 79e96a1adee2..811bd5e69337 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -122,7 +122,8 @@ int hist_entry__sort_snprintf(struct hist_entry *he, char 
*bf, size_t size,
 void hist_entry__delete(struct hist_entry *he);
 
 void hists__output_resort(struct hists *hists, struct ui_progress *prog);
-void hists__mt_resort(struct hists *dst, struct hists *src);
+void hists__mt_resort(struct hists *dst, struct hists *src,
+ struct ui_progress *prog);
 void hists__collapse_resort(struct hists *hists, struct ui_progress *prog);
 
 void hists__decay_entries(struct hists *hists, bool zap_user, bool zap_kernel);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 8ab65ac54258..dcb9747bbb49 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1481,7 +1481,8 @@ fetch_mmaped_event(struct perf_session *session,
 static int __perf_session__process_events(struct perf_session *session,
  struct events_stats *stats,
  u64 data_offset, u64 data_size,
- u64 file_size)
+ u64 file_size,
+ struct ui_progress *prog)
 {
struct ordered_events *oe = &session->ordered_events;
struct perf_tool *tool = session->tool;
@@ -1491,7 +1492,6 @@ static int __perf_session__process_events(struct 
perf_session *session,
size_t  mmap_size;
char *buf, *mmaps[NUM_MMAPS];
union perf_event *event;
-   struct ui_progress prog;
s64 skip;
 
perf_tool__fill_defaults(tool);
@@ -1503,8 +1503,6 @@ static int __perf_session__process_events(struct 
perf_session *session,
if (data_size && (data_offset + data_size < file_size))
file_size = data_offset + data_size;
 
-   ui_progress__init(&prog, file_size, "Processing events...");
-
mmap_size = MMAP_SIZE;
if (mmap_size > file_size) {
mmap_size = file_size;
@@ -1570,7 +1568,7 @@ static int __perf_session__process_events(struct 
perf_session *session,
head += size;
file_pos += size;
 
-   ui_progress__update(&prog, size);
+   ui_progress__update(prog, size);
 
if (session_done())
goto out;
@@ -1585,7 +1583,6 @@ static int __perf_session__process_events(struct 
perf_session *session,
goto out_err;
err = auxtrace__flush_events(session, tool);
 out_err:
-   ui_progress__finish();
ordered_events__free(&session->ordered_events);
auxtrace__free_events(session);
session->one_mmap = false;
@@ -1594,12 +1591,15 @@ static int __perf_session__process_events(struct 
perf_session *session,
 
 static int __perf_session__process_indexed_events(struct perf_session *session)
 {
+   struct ui_progress prog;
struct perf_data_file *file = session->file;
struct perf_tool *tool = session->tool;
u64 size = perf_data_file__size(file);
struct events_stats *stats = &session->evlist->stats;
int err = 0, i;
 
+   ui_progress__init(&prog, size, "Processing events...");
+
for (i = 0; i < (int)session->header.nr_index; i++) {
struct perf_file_section *idx = &session->header.index[i];
 
@@ -1616,17 +1616,20 @@ static int 
__perf_session__process_indexed_events(struct perf_session *session)
 
err = __perf_session__process_events(session, stats,
 idx->offset,
-

[PATCH Not-for-merge 40/40] perf tools: Disable thread refcount due to bug

2015-05-17 Thread Namhyung Kim

This makes the thread mg sharing test fail because thread->refcnt is
not decremented on thread__put().

Not-signed-off-by: Namhyung Kim 
---
 tools/perf/util/thread.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 702f12dc5a90..dc5ec9a5cca1 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -163,7 +163,7 @@ struct thread *thread__get(struct thread *thread)
 
 void thread__put(struct thread *thread)
 {
-   if (thread && atomic_dec_and_test(&thread->refcnt)) {
+   if (thread && atomic_dec_and_test(&thread->refcnt) && 0) {
if (!RB_EMPTY_NODE(&thread->rb_node)) {
struct machine *machine = thread->mg->machine;
 
-- 
2.4.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 38/40] perf session: Handle index files generally

2015-05-17 Thread Namhyung Kim

The current code assumes that the number of index items matches the
number of cpus, so it creates that many threads.  But this is not the
case for a non-system-wide session or for data that came from a
different machine.

Just create at most as many threads as there are online cpus and
process the data.
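
The approach described above can be sketched as a small work queue:
worker threads repeatedly ask a mutex-protected dispenser for the next
index until none is left, and the thread count is capped by the number
of online cpus.  This is only an illustration under stated assumptions;
struct index_queue, index_queue__get(), index_worker() and
process_indexes() are hypothetical names, not perf code, and the
"processing" is reduced to a counter.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* All names here are hypothetical; this is not the perf code. */
struct index_queue {
	pthread_mutex_t lock;
	int next;      /* next index to hand out; entry 0 (meta events) is skipped */
	int nr_index;  /* total number of index entries */
	int *done;     /* done[i] counts how often entry i was processed */
};

/* Hand out the next unprocessed index, or -1 when none is left. */
static int index_queue__get(struct index_queue *q)
{
	int ret;

	pthread_mutex_lock(&q->lock);
	ret = q->next < q->nr_index ? q->next++ : -1;
	pthread_mutex_unlock(&q->lock);

	return ret;
}

static void *index_worker(void *arg)
{
	struct index_queue *q = arg;
	int idx;

	/* Each index is handed to exactly one thread, so done[idx] needs no lock. */
	while ((idx = index_queue__get(q)) >= 0)
		q->done[idx]++;	/* stand-in for processing the samples of entry idx */

	return NULL;
}

/* Process entries 1..nr_index-1 using at most one thread per online cpu. */
static int process_indexes(int nr_index, int *done)
{
	struct index_queue q = { .next = 1, .nr_index = nr_index, .done = done };
	long nr_thread = sysconf(_SC_NPROCESSORS_ONLN);
	pthread_t *th;
	long i, started = 0;

	if (nr_index <= 1)
		return 0;	/* nothing but meta events */
	if (nr_thread < 1)
		nr_thread = 1;
	if (nr_thread > nr_index - 1)
		nr_thread = nr_index - 1;

	th = calloc(nr_thread, sizeof(*th));
	if (th == NULL)
		return -1;

	pthread_mutex_init(&q.lock, NULL);

	for (i = 0; i < nr_thread; i++) {
		if (pthread_create(&th[started], NULL, index_worker, &q) == 0)
			started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(th[i], NULL);

	pthread_mutex_destroy(&q.lock);
	free(th);

	return started > 0 ? 0 : -1;
}
```

Because every worker pulls from the same dispenser, the number of
threads no longer has to match the number of index entries.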

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/session.c | 79 ++-
 tools/perf/util/tool.h|  1 -
 2 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index dcb9747bbb49..5f6c319bd236 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1679,25 +1679,50 @@ static struct ui_progress_ops mt_progress__ops = {
.update = mt_progress__update,
 };
 
+static int perf_session__get_index(struct perf_session *session)
+{
+   int ret;
+   static unsigned idx = 1;
+   static pthread_mutex_t idx_lock = PTHREAD_MUTEX_INITIALIZER;
+
+   pthread_mutex_lock(&idx_lock);
+   if (idx < session->header.nr_index)
+   ret = idx++;
+   else
+   ret = -1;
+   pthread_mutex_unlock(&idx_lock);
+
+   return ret;
+}
+
 static void *processing_thread_idx(void *arg)
 {
struct perf_tool_mt *mt_tool = arg;
struct perf_session *session = mt_tool->session;
-   u64 offset = session->header.index[mt_tool->idx].offset;
-   u64 size = session->header.index[mt_tool->idx].size;
u64 file_size = perf_data_file__size(session->file);
+   int idx;
 
-   ui_progress__init(&mt_tool->prog, size, "");
+   while ((idx = perf_session__get_index(session)) >= 0) {
+   u64 offset = session->header.index[idx].offset;
+   u64 size = session->header.index[idx].size;
+   struct perf_tool_mt *mtt = &mt_tool[idx];
 
-   pr_debug("processing samples using thread [%d]\n", mt_tool->idx);
-   if (__perf_session__process_events(session, &mt_tool->stats,
-  offset, size, file_size,
-  &mt_tool->prog) < 0) {
-   pr_err("processing samples failed (thread [%d])\n", 
mt_tool->idx);
-   return NULL;
+   if (size == 0)
+   continue;
+
+   pr_debug("processing samples [index %d]\n", idx);
+
+   ui_progress__init(&mtt->prog, size, "");
+
+   if (__perf_session__process_events(mtt->session, &mtt->stats,
+  offset, size, file_size,
+  &mtt->prog) < 0) {
+   pr_err("processing samples failed [index %d]\n", idx);
+   return NULL;
+   }
+   pr_debug("processing samples done [index %d]\n", idx);
}
 
-   pr_debug("processing samples done for thread [%d]\n", mt_tool->idx);
return arg;
 }
 
@@ -1717,6 +1742,7 @@ int perf_session__process_events_mt(struct perf_session 
*session, void *arg)
int err, i, k;
int nr_index = session->header.nr_index;
u64 size = perf_data_file__size(file);
+   int nr_thread = sysconf(_SC_NPROCESSORS_ONLN);
 
if (perf_data_file__is_pipe(file) || !session->header.index) {
pr_err("data file doesn't contain the index table\n");
@@ -1753,15 +1779,18 @@ int perf_session__process_events_mt(struct perf_session 
*session, void *arg)
 
tool->ordered_events = false;
 
-   for (i = 1; i < nr_index; i++) {
+   for (i = 0; i < nr_index; i++) {
ms = &mt_sessions[i];
mt = &mt_tools[i];
 
+   ms->tool = &mt->tool;
ms->file = session->file;
ms->evlist = session->evlist;
ms->header = session->header;
ms->tevent = session->tevent;
+   ms->machines = session->machines;
 
+   ordered_events__init(&ms->ordered_events, NULL);
memcpy(&mt->tool, tool, sizeof(*tool));
 
mt->hists = calloc(evlist->nr_entries, sizeof(*mt->hists));
@@ -1772,20 +1801,28 @@ int perf_session__process_events_mt(struct perf_session 
*session, void *arg)
__hists__init(&mt->hists[k]);
 
mt->session = ms;
-   mt->idx = i;
mt->priv = arg;
mt->global_prog = &prog;
-
-   pthread_create(&th_id[i], NULL, processing_thread_idx, mt);
}
 
-   for (i = 1; i < nr_index; i++) {
+   if (nr_thread > nr_index - 1)
+   nr_thread = nr_index - 1;
+
+   th_id = calloc(nr_thread, sizeof(*th_id));
+   if (th_id == NULL)
+   goto out;
+
+   for (i = 0; i < nr_thread; i++)
+   pthread_create(&th_id[i], NULL, processing_thread_idx, 
mt_tools);
+
+   for (i = 0; i < nr_thread; i++) {
pthread_join(th_id[i], (void **)&mt);
-   if (mt == NULL) {
+   if (mt == NULL)

[PATCH 37/40] perf report: Add --multi-thread option and config item

2015-05-17 Thread Namhyung Kim

The --multi-thread option enables parallel processing, so users can
still force serial processing even for an indexed data file.  It defaults
to false for now, but users can also change this by setting the
"report.multi-thread" config option in the ~/.perfconfig file.

Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-report.txt |  2 +
 tools/perf/builtin-report.c  | 66 +++-
 2 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index c33b69f3374f..3917710e2620 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -353,6 +353,8 @@ OPTIONS
 
To disable decoding entirely, use --no-itrace.
 
+--multi-thread::
+   Speed up report by parallelizing sample processing using multi-thread.
 
 include::callchain-overhead-calculation.txt[]
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 650d78ad3357..4d08e5f0a7bb 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -53,6 +53,7 @@ struct report {
boolmem_mode;
boolheader;
boolheader_only;
+   boolmulti_thread;
int max_stack;
struct perf_read_values show_threads_values;
const char  *pretty_printing_style;
@@ -84,6 +85,10 @@ static int report__config(const char *var, const char 
*value, void *cb)
rep->queue_size = perf_config_u64(var, value);
return 0;
}
+   if (!strcmp(var, "report.multi-thread")) {
+   rep->multi_thread = perf_config_bool(var, value);
+   return 0;
+   }
 
return perf_default_config(var, value, cb);
 }
@@ -130,17 +135,18 @@ static int hist_iter__report_callback(struct 
hist_entry_iter *iter,
return err;
 }
 
-static int process_sample_event(struct perf_tool *tool,
-   union perf_event *event,
-   struct perf_sample *sample,
-   struct perf_evsel *evsel,
-   struct machine *machine)
+static int __process_sample_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine,
+ struct hists *hists,
+ struct report *rep)
 {
-   struct report *rep = container_of(tool, struct report, tool);
struct addr_location al;
struct hist_entry_iter iter = {
.evsel  = evsel,
-   .hists  = evsel__hists(evsel),
+   .hists  = hists,
.sample = sample,
.hide_unresolved= rep->hide_unresolved,
.add_entry_cb   = hist_iter__report_callback,
@@ -179,6 +185,31 @@ static int process_sample_event(struct perf_tool *tool,
return ret;
 }
 
+static int process_sample_event(struct perf_tool *tool,
+   union perf_event *event,
+   struct perf_sample *sample,
+   struct perf_evsel *evsel,
+   struct machine *machine)
+{
+   struct report *rep = container_of(tool, struct report, tool);
+
+   return __process_sample_event(tool, event, sample, evsel, machine,
+ evsel__hists(evsel), rep);
+}
+
+static int process_sample_event_mt(struct perf_tool *tool,
+  union perf_event *event,
+  struct perf_sample *sample,
+  struct perf_evsel *evsel,
+  struct machine *machine)
+{
+   struct perf_tool_mt *mt = container_of(tool, struct perf_tool_mt, tool);
+   struct report *rep = mt->priv;
+
+   return __process_sample_event(tool, event, sample, evsel, machine,
+ &mt->hists[evsel->idx], rep);
+}
+
 static int process_read_event(struct perf_tool *tool,
  union perf_event *event,
  struct perf_sample *sample __maybe_unused,
@@ -489,7 +520,12 @@ static int __cmd_report(struct report *rep)
if (ret)
return ret;
 
-   ret = perf_session__process_events(session);
+   if (rep->multi_thread) {
+   rep->tool.sample = process_sample_event_mt;
+   ret = perf_session__process_events_mt(session, rep);
+   } else {
+   ret = perf_session__process_events(session);
+   }
if (ret)

[PATCH 33/40] perf session: Separate struct machines from session

2015-05-17 Thread Namhyung Kim

With multi-thread report, a separate session can be passed to each
thread.  In this case we should keep a single machine state for all
struct sessions.  Separate the machines from the session and keep a
pointer to them in each session.

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-annotate.c |  2 +-
 tools/perf/builtin-kmem.c | 10 +-
 tools/perf/builtin-kvm.c  |  2 +-
 tools/perf/builtin-record.c   |  4 ++--
 tools/perf/builtin-report.c   |  4 ++--
 tools/perf/builtin-top.c  |  8 
 tools/perf/builtin-trace.c|  2 +-
 tools/perf/util/build-id.c| 16 
 tools/perf/util/session.c | 36 +---
 tools/perf/util/session.h |  6 +++---
 10 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 761f902473b7..d4ad323ddfe2 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -196,7 +196,7 @@ static int __cmd_annotate(struct perf_annotate *ann)
struct perf_evsel *pos;
u64 total_nr_samples;
 
-   machines__set_symbol_filter(&session->machines, symbol__annotate_init);
+   machines__set_symbol_filter(session->machines, symbol__annotate_init);
 
if (ann->cpu_list) {
ret = perf_session__cpu_bitmap(session, ann->cpu_list,
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 254614b10c4a..4502954094e5 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -316,7 +316,7 @@ static int build_alloc_func_list(void)
struct symbol *sym;
struct rb_node *node;
struct alloc_func *func;
-   struct machine *machine = &kmem_session->machines.host;
+   struct machine *machine = &kmem_session->machines->host;
regex_t alloc_func_regex;
const char pattern[] = "^_?_?(alloc|get_free|get_zeroed)_pages?";
 
@@ -366,7 +366,7 @@ static int build_alloc_func_list(void)
 static u64 find_callsite(struct perf_evsel *evsel, struct perf_sample *sample)
 {
struct addr_location al;
-   struct machine *machine = &kmem_session->machines.host;
+   struct machine *machine = &kmem_session->machines->host;
struct callchain_cursor_node *node;
 
if (alloc_func_list == NULL) {
@@ -949,7 +949,7 @@ static void __print_slab_result(struct rb_root *root,
int n_lines, int is_caller)
 {
struct rb_node *next;
-   struct machine *machine = &session->machines.host;
+   struct machine *machine = &session->machines->host;
 
printf("%.105s\n", graph_dotted_line);
printf(" %-34s |",  is_caller ? "Callsite": "Alloc Ptr");
@@ -1010,7 +1010,7 @@ static const char * const migrate_type_str[] = {
 static void __print_page_alloc_result(struct perf_session *session, int 
n_lines)
 {
struct rb_node *next = rb_first(&page_alloc_sorted);
-   struct machine *machine = &session->machines.host;
+   struct machine *machine = &session->machines->host;
const char *format;
int gfp_len = max(strlen("GFP flags"), max_gfp_len);
 
@@ -1060,7 +1060,7 @@ static void __print_page_alloc_result(struct perf_session 
*session, int n_lines)
 static void __print_page_caller_result(struct perf_session *session, int 
n_lines)
 {
struct rb_node *next = rb_first(&page_caller_sorted);
-   struct machine *machine = &session->machines.host;
+   struct machine *machine = &session->machines->host;
int gfp_len = max(strlen("GFP flags"), max_gfp_len);
 
printf("\n%.105s\n", graph_dotted_line);
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 15fecd3dc5d8..de3c7e1d0b80 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1392,7 +1392,7 @@ static int kvm_events_live(struct perf_kvm_stat *kvm,
kvm->session->evlist = kvm->evlist;
perf_session__set_id_hdr_size(kvm->session);
ordered_events__set_copy_on_queue(&kvm->session->ordered_events, true);
-   machine__synthesize_threads(&kvm->session->machines.host, 
&kvm->opts.target,
+   machine__synthesize_threads(&kvm->session->machines->host, 
&kvm->opts.target,
kvm->evlist->threads, false);
err = kvm_live_open_events(kvm);
if (err)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4ddf104f50ff..978ebf648aab 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -684,7 +684,7 @@ static int __cmd_record(struct record *rec, int argc, const 
char **argv)
goto out_child;
}
 
-   machine = &session->machines.host;
+   machine = &session->machines->host;
 
if (file->is_pipe) {
err = perf_event__synthesize_attrs(tool, session,
@@ -735,7 +735,7 @@ static int __cmd_record(struct record *rec, int argc, const 
char **argv)
   "Check /proc/modules permission or run as root.\n");
 
if (perf_guest) {
-   machines__process_guests(&session->machines,
+   machines__process_guests(session->machines,

[PATCH 31/40] perf hists: Pass hists struct to hist_entry_iter struct

2015-05-17 Thread Namhyung Kim

This is a preparation for perf report multi-thread support.  When
multi-thread is enabled, each thread will have its own hists during
sample processing.

Signed-off-by: Namhyung Kim 
---
 tools/perf/builtin-report.c   |  1 +
 tools/perf/builtin-top.c  |  1 +
 tools/perf/tests/hists_cumulate.c |  1 +
 tools/perf/tests/hists_filter.c   |  1 +
 tools/perf/tests/hists_output.c   |  1 +
 tools/perf/util/hist.c| 20 +++-
 tools/perf/util/hist.h|  1 +
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index decd9e8584b5..5e53eee5a9a7 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -140,6 +140,7 @@ static int process_sample_event(struct perf_tool *tool,
struct addr_location al;
struct hist_entry_iter iter = {
.evsel  = evsel,
+   .hists  = evsel__hists(evsel),
.sample = sample,
.hide_unresolved= rep->hide_unresolved,
.add_entry_cb   = hist_iter__report_callback,
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 6b987424d015..ea6e7bd04f9a 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -776,6 +776,7 @@ static void perf_event__process_sample(struct perf_tool 
*tool,
struct hists *hists = evsel__hists(evsel);
struct hist_entry_iter iter = {
.evsel  = evsel,
+   .hists  = evsel__hists(evsel),
.sample = sample,
.add_entry_cb   = hist_iter__top_callback,
};
diff --git a/tools/perf/tests/hists_cumulate.c 
b/tools/perf/tests/hists_cumulate.c
index 7d82c8be5e36..36f31f839b96 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -88,6 +88,7 @@ static int add_hist_entries(struct hists *hists, struct 
machine *machine)
};
struct hist_entry_iter iter = {
.evsel = evsel,
+   .hists = evsel__hists(evsel),
.sample = &sample,
.hide_unresolved = false,
};
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index ce48775e6ada..f8077b81b618 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -64,6 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist,
};
struct hist_entry_iter iter = {
.evsel = evsel,
+   .hists = evsel__hists(evsel),
.sample = &sample,
.ops = &hist_iter_normal,
.hide_unresolved = false,
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index adbebc852cc8..bf9efe145260 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -58,6 +58,7 @@ static int add_hist_entries(struct hists *hists, struct 
machine *machine)
};
struct hist_entry_iter iter = {
.evsel = evsel,
+   .hists = evsel__hists(evsel),
.sample = &sample,
.ops = &hist_iter_normal,
.hide_unresolved = false,
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b492968913e1..c11f7fdc08fd 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -505,7 +505,7 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, 
struct addr_location *al
u64 cost;
struct mem_info *mi = iter->priv;
struct perf_sample *sample = iter->sample;
-   struct hists *hists = evsel__hists(iter->evsel);
+   struct hists *hists = iter->hists;
struct hist_entry *he;
 
if (mi == NULL)
@@ -535,8 +535,7 @@ static int
 iter_finish_mem_entry(struct hist_entry_iter *iter,
  struct addr_location *al __maybe_unused)
 {
-   struct perf_evsel *evsel = iter->evsel;
-   struct hists *hists = evsel__hists(evsel);
+   struct hists *hists = iter->hists;
struct hist_entry *he = iter->he;
int err = -EINVAL;
 
@@ -608,8 +607,7 @@ static int
 iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location 
*al)
 {
struct branch_info *bi;
-   struct perf_evsel *evsel = iter->evsel;
-   struct hists *hists = evsel__hists(evsel);
+   struct hists *hists = iter->hists;
struct hist_entry *he = NULL;
int i = iter->curr;
int err = 0;
@@ -656,11 +654,10 @@ iter_prepare_normal_entry(struct hist_entry_iter *iter 
__maybe_unused,
 static int
 iter_add_single_normal_entry(struct hist_entry_iter *iter, struct

[PATCH 25/40] perf tools: Protect dso symbol loading using a mutex

2015-05-17 Thread Namhyung Kim

When multi-thread support for perf report is enabled, it's possible to
access a dso concurrently.  Add a new pthread_mutex to protect it from
concurrent dso__load().
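
As a rough illustration of the locking pattern, the following sketch
loads an object at most once even when several threads request it at
the same time: the "loaded" state is checked again after taking the
lock, and it is set before the lock is released.  struct lazy_object
and its helpers are hypothetical stand-ins, not perf's struct dso or
dso__load().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dso; not perf's struct dso. */
struct lazy_object {
	pthread_mutex_t lock;
	bool loaded;
	int nr_loads;	/* how many times the expensive load actually ran */
};

static void lazy_object__init(struct lazy_object *obj)
{
	pthread_mutex_init(&obj->lock, NULL);
	obj->loaded = false;
	obj->nr_loads = 0;
}

/* Returns 1 if the object was already loaded, 0 if this call loaded it. */
static int lazy_object__load(struct lazy_object *obj)
{
	int ret = 0;

	pthread_mutex_lock(&obj->lock);

	/* check the state under the lock; another thread may have won */
	if (obj->loaded) {
		ret = 1;
		goto out;
	}

	obj->nr_loads++;	/* stand-in for the real symbol loading work */
	obj->loaded = true;
out:
	pthread_mutex_unlock(&obj->lock);
	return ret;
}

/* Thread entry point that simply requests a load. */
static void *lazy_loader(void *arg)
{
	lazy_object__load(arg);
	return NULL;
}
```

Funnelling every exit path through a single label keeps the unlock in
one place, which is the same shape the diff below gives to dso__load().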

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/dso.c|  2 ++
 tools/perf/util/dso.h|  1 +
 tools/perf/util/symbol.c | 34 --
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 7078700233fa..c14d981568fd 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -936,6 +936,7 @@ struct dso *dso__new(const char *name)
RB_CLEAR_NODE(&dso->rb_node);
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
+   pthread_mutex_init(&dso->lock, NULL);
}
 
return dso;
@@ -966,6 +967,7 @@ void dso__delete(struct dso *dso)
dso_cache__free(&dso->data.cache);
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
+   pthread_mutex_destroy(&dso->lock);
free(dso);
 }
 
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3d79c749934c..b26ec3ab1336 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -129,6 +129,7 @@ struct dsos {
 struct auxtrace_cache;
 
 struct dso {
+   pthread_mutex_t  lock;
struct list_head node;
struct rb_node   rb_node;   /* rbtree node sorted by long name */
struct rb_root   symbols[MAP__NR_TYPES];
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 5d69d3c407e6..6c88c607930f 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1383,12 +1383,22 @@ int dso__load(struct dso *dso, struct map *map, 
symbol_filter_t filter)
struct symsrc *syms_ss = NULL, *runtime_ss = NULL;
bool kmod;
 
-   dso__set_loaded(dso, map->type);
+   pthread_mutex_lock(&dso->lock);
+
+   /* check again under the dso->lock */
+   if (dso__loaded(dso, map->type)) {
+   ret = 1;
+   goto out;
+   }
+
+   if (dso->kernel) {
+   if (dso->kernel == DSO_TYPE_KERNEL)
+   ret = dso__load_kernel_sym(dso, map, filter);
+   else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
+   ret = dso__load_guest_kernel_sym(dso, map, filter);
 
-   if (dso->kernel == DSO_TYPE_KERNEL)
-   return dso__load_kernel_sym(dso, map, filter);
-   else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
-   return dso__load_guest_kernel_sym(dso, map, filter);
+   goto out;
+   }
 
if (map->groups && map->groups->machine)
machine = map->groups->machine;
@@ -1401,18 +1411,18 @@ int dso__load(struct dso *dso, struct map *map, 
symbol_filter_t filter)
struct stat st;
 
if (lstat(dso->name, &st) < 0)
-   return -1;
+   goto out;
 
if (st.st_uid && (st.st_uid != geteuid())) {
pr_warning("File %s not owned by current user or root, "
"ignoring it.\n", dso->name);
-   return -1;
+   goto out;
}
 
ret = dso__load_perf_map(dso, map, filter);
dso->symtab_type = ret > 0 ? DSO_BINARY_TYPE__JAVA_JIT :
 DSO_BINARY_TYPE__NOT_FOUND;
-   return ret;
+   goto out;
}
 
if (machine)
@@ -1420,7 +1430,7 @@ int dso__load(struct dso *dso, struct map *map, 
symbol_filter_t filter)
 
name = malloc(PATH_MAX);
if (!name)
-   return -1;
+   goto out;
 
kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
@@ -1501,7 +1511,11 @@ int dso__load(struct dso *dso, struct map *map, 
symbol_filter_t filter)
 out_free:
free(name);
if (ret < 0 && strstr(dso->name, " (deleted)") != NULL)
-   return 0;
+   ret = 0;
+out:
+   dso__set_loaded(dso, map->type);
+   pthread_mutex_unlock(&dso->lock);
+
return ret;
 }
 
-- 
2.4.0


[PATCH 23/40] perf tools: Use map_groups__find_addr_by_time()

2015-05-17 Thread Namhyung Kim

Use the timestamp to find the corresponding map so that the matching
symbol can eventually be found.
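
To make the idea concrete, here is a minimal sketch of a time-aware
lookup.  It assumes that each map records the time it was created and
that the wanted map is the most recent one covering the address at or
before the sample time.  The actual semantics of
map_groups__find_by_time() are not shown in this message, so struct
timed_map and find_map_by_time() are purely hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map record; not perf's struct map. */
struct timed_map {
	uint64_t start;		/* first address covered */
	uint64_t end;		/* one past the last address covered */
	uint64_t timestamp;	/* time at which the mapping was created */
};

/*
 * Return the most recently created map that covers addr and was
 * created at or before timestamp, or NULL if there is none.
 */
static const struct timed_map *
find_map_by_time(const struct timed_map *maps, size_t nr,
		 uint64_t addr, uint64_t timestamp)
{
	const struct timed_map *best = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		const struct timed_map *m = &maps[i];

		if (addr < m->start || addr >= m->end)
			continue;	/* does not cover the address */
		if (m->timestamp > timestamp)
			continue;	/* created after the sample */
		if (best == NULL || m->timestamp >= best->timestamp)
			best = m;
	}

	return best;
}
```

With two maps at the same address range created at different times, a
sample taken between the two creation times resolves to the older map.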

Cc: Stephane Eranian 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/event.c  | 81 ++--
 tools/perf/util/thread.c |  8 +++--
 2 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index bb391c20920d..930d45d5a37a 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -832,12 +832,11 @@ int perf_event__process(struct perf_tool *tool 
__maybe_unused,
return machine__process_event(machine, event, sample);
 }
 
-static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
- enum map_type type, u64 addr,
- struct addr_location *al)
+static bool map_groups__set_addr_location(struct map_groups *mg,
+ struct addr_location *al,
+ u8 cpumode, u64 addr)
 {
struct machine *machine = mg->machine;
-   bool load_map = false;
 
al->machine = machine;
al->addr = addr;
@@ -846,21 +845,17 @@ static void map_groups__find_addr_map(struct map_groups 
*mg, u8 cpumode,
 
if (machine == NULL) {
al->map = NULL;
-   return;
+   return true;
}
 
BUG_ON(mg == NULL);
 
if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
al->level = 'k';
-   mg = &machine->kmaps;
-   load_map = true;
} else if (cpumode == PERF_RECORD_MISC_USER && perf_host) {
al->level = '.';
} else if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest) {
al->level = 'g';
-   mg = &machine->kmaps;
-   load_map = true;
} else if (cpumode == PERF_RECORD_MISC_GUEST_USER && perf_guest) {
al->level = 'u';
} else {
@@ -876,8 +871,27 @@ static void map_groups__find_addr_map(struct map_groups 
*mg, u8 cpumode,
!perf_host)
al->filtered |= (1 << HIST_FILTER__HOST);
 
+   return true;
+   }
+   return false;
+}
+
+static void map_groups__find_addr_map(struct map_groups *mg, u8 cpumode,
+ enum map_type type, u64 addr,
+ struct addr_location *al)
+{
+   struct machine *machine = mg->machine;
+   bool load_map = false;
+
+   if (map_groups__set_addr_location(mg, al, cpumode, addr))
return;
+
+   if ((cpumode == PERF_RECORD_MISC_KERNEL && perf_host) ||
+   (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest)) {
+   mg = &machine->kmaps;
+   load_map = true;
}
+
 try_again:
al->map = map_groups__find(mg, type, al->addr);
if (al->map == NULL) {
@@ -908,6 +922,53 @@ static void map_groups__find_addr_map(struct map_groups 
*mg, u8 cpumode,
}
 }
 
+static void map_groups__find_addr_map_by_time(struct map_groups *mg, u8 
cpumode,
+ enum map_type type, u64 addr,
+ struct addr_location *al,
+ u64 timestamp)
+{
+   struct machine *machine = mg->machine;
+   bool load_map = false;
+
+   if (map_groups__set_addr_location(mg, al, cpumode, addr))
+   return;
+
+   if ((cpumode == PERF_RECORD_MISC_KERNEL && perf_host) ||
+   (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest)) {
+   mg = &machine->kmaps;
+   load_map = true;
+   }
+
+try_again:
+   al->map = map_groups__find_by_time(mg, type, al->addr, timestamp);
+   if (al->map == NULL) {
+   /*
+* If this is outside of all known maps, and is a negative
+* address, try to look it up in the kernel dso, as it might be
+* a vsyscall or vdso (which executes in user-mode).
+*
+* XXX This is nasty, we should have a symbol list in the
+* "[vdso]" dso, but for now lets use the old trick of looking
+* in the whole kernel symbol list.
+*/
+   if (cpumode == PERF_RECORD_MISC_USER && machine &&
+   mg != &machine->kmaps &&
+   machine__kernel_ip(machine, al->addr)) {
+   mg = &machine->kmaps;
+   load_map = true;
+   goto try_again;
+   }
+   } else {
+   /*
+* Kernel maps might be changed when loading symbols so loading
+* must be done prior to using kernel maps.
+*/
+   if (load_map)
+   map__load(al->map, machine->symbol_filter);
+   al->addr = al->map->map_ip(al->map, al->addr);
+

[PATCH 08/40] perf record: Add --index option for building index table

2015-05-17 Thread Namhyung Kim

The new --index option will create an indexed data file which can be
processed by multiple threads in parallel.  It saves meta events and
sample data in separate files and merges them with an index table.

If there's an index table in the data file, the HEADER_DATA_INDEX
feature bit is set and session->header.index[0] will point to the meta
event area, and the rest point to sample data.  It'd look like below:

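+---------------------+
|     file header     |
|---------------------|
|                     |
|    meta events[0] <-+--+
|                     |  |
|---------------------|  |
|                     |  |
|    sample data[1] <-+--+
|                     |  |
|---------------------|  |
|                     |  |
|    sample data[2] <-|--+
|                     |  |
|---------------------|  |
|         ...         | ...
|---------------------|  |
|     feature data    |  |
|   (contains index) -+--+
+---------------------+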
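
The layout can also be read as an array of (offset, size) pairs in
which entry 0 describes the meta event area and every later entry
describes one chunk of sample data.  The sketch below only illustrates
that reading; struct file_section and sample_data_bytes() are
hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One index entry: where a section starts in the file and how long it is. */
struct file_section {
	uint64_t offset;
	uint64_t size;
};

/*
 * Sum the size of all sample data sections.  Entry 0 holds meta
 * events, so the loop starts at entry 1.
 */
static uint64_t sample_data_bytes(const struct file_section *index,
				  size_t nr_index)
{
	uint64_t total = 0;
	size_t i;

	for (i = 1; i < nr_index; i++)
		total += index[i].size;

	return total;
}
```

A reader that wants to process samples in parallel can hand each entry
from 1 onwards to a different thread, since the sections do not overlap.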

Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-record.txt |   4 +
 tools/perf/builtin-record.c  | 172 ---
 tools/perf/perf.h|   1 +
 tools/perf/util/header.c |   2 +
 tools/perf/util/session.c|   1 +
 5 files changed, 166 insertions(+), 14 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 280533ebf9df..7eac31f02f8c 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -267,6 +267,10 @@ AUX area tracing event. Optionally the number of bytes to 
capture per
 snapshot can be specified. In Snapshot Mode, trace data is captured only when
 signal SIGUSR2 is received.
 
+--index::
+Build an index table for sample data.  This will speed up perf report by
+parallel processing.
+
 SEE ALSO
 
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 303116c9a38a..4ddf104f50ff 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -40,6 +40,7 @@ struct record {
u64 bytes_written;
struct perf_data_file   file;
struct auxtrace_record  *itr;
+   int *fds;
struct perf_evlist  *evlist;
struct perf_session *session;
const char  *progname;
@@ -49,9 +50,16 @@ struct record {
longsamples;
 };
 
-static int record__write(struct record *rec, void *bf, size_t size)
+static int record__write(struct record *rec, void *bf, size_t size, int idx)
 {
-   if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+   int fd;
+
+   if (rec->fds && idx >= 0)
+   fd = rec->fds[idx];
+   else
+   fd = perf_data_file__fd(rec->session->file);
+
+   if (writen(fd, bf, size) < 0) {
pr_err("failed to write perf data, error: %m\n");
return -1;
}
@@ -66,7 +74,7 @@ static int process_synthesized_event(struct perf_tool *tool,
 struct machine *machine __maybe_unused)
 {
struct record *rec = container_of(tool, struct record, tool);
-   return record__write(rec, event, event->header.size);
+   return record__write(rec, event, event->header.size, -1);
 }
 
 static int record__mmap_read(struct record *rec, int idx)
@@ -91,7 +99,7 @@ static int record__mmap_read(struct record *rec, int idx)
size = md->mask + 1 - (old & md->mask);
old += size;
 
-   if (record__write(rec, buf, size) < 0) {
+   if (record__write(rec, buf, size, idx) < 0) {
rc = -1;
goto out;
}
@@ -101,7 +109,7 @@ static int record__mmap_read(struct record *rec, int idx)
size = head - old;
old += size;
 
-   if (record__write(rec, buf, size) < 0) {
+   if (record__write(rec, buf, size, idx) < 0) {
rc = -1;
goto out;
}
@@ -149,6 +157,7 @@ static int record__process_auxtrace(struct perf_tool *tool,
struct perf_data_file *file = &rec->file;
size_t padding;
u8 pad[8] = {0};
+   int idx = event->auxtrace.idx;
 
if (!perf_data_file__is_pipe(file)) {
off_t file_offset;
@@ -169,11 +178,11 @@ static int record__process_auxtrace(struct perf_tool 
*tool,
if (padding)
padding = 8 - padding;
 
-   record__write(rec, event, event->header.size);
-   record__write(rec, data1, len1);
+   record__write(rec, event, event->header.size, idx);
+   record__write(rec, data1, len1, idx);
if (len2)
-   record__write(rec, data2, len2);
-   record__write(rec, &pad,

[PATCH 24/40] perf tools: Add testcase for managing maps with time

2015-05-17 Thread Namhyung Kim

This tests that the new map_groups__{insert,find}_by_time() API works
correctly by using 3 * 100 maps.

Cc: Stephane Eranian 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/Build |  1 +
 tools/perf/tests/builtin-test.c|  4 ++
 tools/perf/tests/tests.h   |  1 +
 tools/perf/tests/thread-map-time.c | 90 ++
 4 files changed, 96 insertions(+)
 create mode 100644 tools/perf/tests/thread-map-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index cfd59c61bcd2..43bf2abf8c5a 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -28,6 +28,7 @@ perf-y += thread-comm.o
 perf-y += thread-mg-share.o
 perf-y += thread-lookup-time.o
 perf-y += thread-mg-time.o
+perf-y += thread-map-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index c5dbeb3d75b1..ba3a7e5650e0 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -183,6 +183,10 @@ static struct test {
.func = test__thread_mg_time,
},
{
+   .desc = "Test thread map lookup with time",
+   .func = test__thread_map_lookup_time,
+   },
+   {
.func = NULL,
},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index a2e1f729ae23..0363a2c9526b 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -64,6 +64,7 @@ int test__kmod_path__parse(void);
 int test__thread_comm(void);
 int test__thread_lookup_time(void);
 int test__thread_mg_time(void);
+int test__thread_map_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-map-time.c b/tools/perf/tests/thread-map-time.c
new file mode 100644
index ..6f28975faeb5
--- /dev/null
+++ b/tools/perf/tests/thread-map-time.c
@@ -0,0 +1,90 @@
+#include "debug.h"
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+
+#define PERF_MAP_START  0x4
+#define LIBC_MAP_START  0x8
+#define VDSO_MAP_START  0x7F000
+
+#define NR_MAPS  100
+
+static int lookup_maps(struct map_groups *mg)
+{
+   struct map *map;
+   int i, ret = -1;
+   size_t n;
+   struct {
+   const char *path;
+   u64 start;
+   } maps[] = {
+   { "/usr/bin/perf",  PERF_MAP_START },
+   { "/usr/lib/libc.so",   LIBC_MAP_START },
+   { "[vdso]", VDSO_MAP_START },
+   };
+
+   /* this is needed to insert/find map by time */
+   perf_has_index = true;
+
+   for (n = 0; n < ARRAY_SIZE(maps); n++) {
+   for (i = 0; i < NR_MAPS; i++) {
+   map = map__new2(maps[n].start, dso__new(maps[n].path),
+   MAP__FUNCTION, i * 1);
+   if (map == NULL) {
+   pr_debug("memory allocation failed\n");
+   goto out;
+   }
+
+   map->end = map->start + 0x1000;
+   map_groups__insert_by_time(mg, map);
+   }
+   }
+
+   if (verbose > 1)
+   map_groups__fprintf(mg, stderr);
+
+   for (n = 0; n < ARRAY_SIZE(maps); n++) {
+   for (i = 0; i < NR_MAPS; i++) {
+   u64 timestamp = i * 1;
+
+   map = map_groups__find_by_time(mg, MAP__FUNCTION,
+  maps[n].start,
+  timestamp);
+
+   TEST_ASSERT_VAL("cannot find map", map);
+   TEST_ASSERT_VAL("addr not matched",
+   map->start == maps[n].start);
+   TEST_ASSERT_VAL("pathname not matched",
+   !strcmp(map->dso->name, maps[n].path));
+   TEST_ASSERT_VAL("timestamp not matched",
+   map->timestamp == timestamp);
+   }
+   }
+
+   ret = 0;
+out:
+   return ret;
+}
+
+/*
+ * This test creates a large number of overlapping maps with increasing
+ * timestamps and finds a map based on timestamp.
+ */
+int test__thread_map_lookup_time(void)
+{
+   struct machines machines;
+   struct machine *machine;
+   struct thread *t;
+   int ret;
+
+   machines__init(&machines);
+   machine = &machines.host;
+
+   t = machine__findnew_thread(machine, 0, 0);
+
+   ret = lookup_maps(t->mg);
+
+   machine__delete_threads(machine);
+   return ret;
+}
-- 
2.4.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read

[PATCH 22/40] perf tools: Introduce map_groups__{insert,find}_by_time()

2015-05-17 Thread Namhyung Kim

Manage maps by timestamp so that the correct map/symbol can be found
for a sample taken at a given time.  With this API, a map_groups can
hold overlapping maps.
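
To illustrate the selection rule, here is a minimal standalone sketch
(not the patch code).  It assumes a simplified, hypothetical
struct timed_map and uses a linear scan over an array instead of the
rb-tree walk; find_by_time() is an illustrative name.  Among the maps
covering an address, it picks the one with the latest timestamp that
is not newer than the sample:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for perf's struct map. */
struct timed_map {
	uint64_t start;      /* first address covered */
	uint64_t end;        /* one past the last address covered */
	uint64_t timestamp;  /* time at which this mapping became valid */
};

/*
 * Return the map that covers 'addr' and is the most recent one created
 * at or before 'timestamp', or NULL if there is none.  A linear scan
 * replaces the rb-tree walk to keep the sketch short.
 */
static const struct timed_map *
find_by_time(const struct timed_map *maps, size_t nr,
	     uint64_t addr, uint64_t timestamp)
{
	const struct timed_map *best = NULL;
	size_t i;

	assert(maps != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		const struct timed_map *m = &maps[i];

		if (addr < m->start || addr >= m->end)
			continue;	/* address not covered */
		if (m->timestamp > timestamp)
			continue;	/* mapping created after the sample */
		if (best == NULL || best->timestamp < m->timestamp)
			best = m;	/* newer candidate wins */
	}

	return best;
}
```

With three overlapping maps created at times 10, 20 and 30, a sample
at time 25 would resolve to the map created at time 20.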

Cc: Stephane Eranian 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/map.c | 52 +++
 tools/perf/util/map.h | 25 +
 2 files changed, 77 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index fef4e38ccd93..8bc016648a34 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -746,6 +746,33 @@ void maps__insert(struct rb_root *maps, struct map *map)
rb_insert_color(&map->rb_node, maps);
 }
 
+void maps__insert_by_time(struct rb_root *maps, struct map *map)
+{
+   struct rb_node **p = &maps->rb_node;
+   struct rb_node *parent = NULL;
+   const u64 ip = map->start;
+   const u64 timestamp = map->timestamp;
+   struct map *m;
+
+   while (*p != NULL) {
+   parent = *p;
+   m = rb_entry(parent, struct map, rb_node);
+   if (ip < m->start)
+   p = &(*p)->rb_left;
+   else if (ip > m->start)
+   p = &(*p)->rb_right;
+   else if (timestamp > m->timestamp)
+   p = &(*p)->rb_left;
+   else if (timestamp < m->timestamp)
+   p = &(*p)->rb_right;
+   else
+   BUG_ON(1);
+   }
+
+   rb_link_node(&map->rb_node, parent, p);
+   rb_insert_color(&map->rb_node, maps);
+}
+
 void maps__remove(struct rb_root *maps, struct map *map)
 {
rb_erase(&map->rb_node, maps);
@@ -771,6 +798,31 @@ struct map *maps__find(struct rb_root *maps, u64 ip)
return NULL;
 }
 
+struct map *maps__find_by_time(struct rb_root *maps, u64 ip, u64 timestamp)
+{
+   struct rb_node **p = &maps->rb_node;
+   struct rb_node *parent = NULL;
+   struct map *m;
+   struct map *best = NULL;
+
+   while (*p != NULL) {
+   parent = *p;
+   m = rb_entry(parent, struct map, rb_node);
+   if (ip < m->start)
+   p = &(*p)->rb_left;
+   else if (ip >= m->end)
+   p = &(*p)->rb_right;
+   else if (timestamp >= m->timestamp) {
+   if (!best || best->timestamp < m->timestamp)
+   best = m;
+   p = &(*p)->rb_left;
+   } else
+   p = &(*p)->rb_right;
+   }
+
+   return best;
+}
+
 struct map *maps__first(struct rb_root *maps)
 {
struct rb_node *first = rb_first(maps);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index c1ed11a3e16c..fc8cdb8853f5 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include "perf.h"  /* for perf_has_index */
+
 enum map_type {
MAP__FUNCTION = 0,
MAP__VARIABLE,
@@ -166,8 +168,10 @@ void map__reloc_vmlinux(struct map *map);
 size_t __map_groups__fprintf_maps(struct map_groups *mg, enum map_type type,
  FILE *fp);
 void maps__insert(struct rb_root *maps, struct map *map);
+void maps__insert_by_time(struct rb_root *maps, struct map *map);
 void maps__remove(struct rb_root *maps, struct map *map);
 struct map *maps__find(struct rb_root *maps, u64 addr);
+struct map *maps__find_by_time(struct rb_root *maps, u64 addr, u64 timestamp);
 struct map *maps__first(struct rb_root *maps);
 struct map *maps__next(struct map *map);
 void map_groups__init(struct map_groups *mg, struct machine *machine);
@@ -185,6 +189,17 @@ static inline void map_groups__insert(struct map_groups *mg, struct map *map)
map->groups = mg;
 }
 
+static inline void map_groups__insert_by_time(struct map_groups *mg,
+ struct map *map)
+{
+   if (perf_has_index)
+   maps__insert_by_time(&mg->maps[map->type], map);
+   else
+   maps__insert(&mg->maps[map->type], map);
+
+   map->groups = mg;
+}
+
 static inline void map_groups__remove(struct map_groups *mg, struct map *map)
 {
maps__remove(&mg->maps[map->type], map);
@@ -196,6 +211,16 @@ static inline struct map *map_groups__find(struct map_groups *mg,
return maps__find(&mg->maps[type], addr);
 }
 
+static inline struct map *map_groups__find_by_time(struct map_groups *mg,
+  enum map_type type, u64 addr,
+  u64 timestamp)
+{
+   if (!perf_has_index)
+   return maps__find(&mg->maps[type], addr);
+
+   return maps__find_by_time(&mg->maps[type], addr, timestamp);
+}
+
 static inline struct map *map_groups__first(struct map_groups *mg,
enum map_type type)
 {
-- 
2.4.0


[PATCH 15/40] perf tools: Add a test case for timed thread handling

2015-05-17 Thread Namhyung Kim

A test case for verifying live and dead thread tree management during
time change and new machine__find{,new}_thread_time().

Cc: Frederic Weisbecker 
Signed-off-by: Namhyung Kim 
---
 tools/perf/tests/Build|   1 +
 tools/perf/tests/builtin-test.c   |   4 +
 tools/perf/tests/tests.h  |   1 +
 tools/perf/tests/thread-lookup-time.c | 179 ++
 4 files changed, 185 insertions(+)
 create mode 100644 tools/perf/tests/thread-lookup-time.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 78d29a3a6a97..5ad495823b49 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -26,6 +26,7 @@ perf-y += sw-clock.o
 perf-y += mmap-thread-lookup.o
 perf-y += thread-comm.o
 perf-y += thread-mg-share.o
+perf-y += thread-lookup-time.o
 perf-y += switch-tracking.o
 perf-y += keep-tracking.o
 perf-y += code-reading.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 372b6395a448..e83c7ce1b38a 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -175,6 +175,10 @@ static struct test {
.func = test__thread_comm,
},
{
+   .desc = "Test thread lookup with time",
+   .func = test__thread_lookup_time,
+   },
+   {
.func = NULL,
},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index aa269eff798a..e9aa78c3d8fc 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -62,6 +62,7 @@ int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__kmod_path__parse(void);
 int test__thread_comm(void);
+int test__thread_lookup_time(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/thread-lookup-time.c b/tools/perf/tests/thread-lookup-time.c
new file mode 100644
index ..0133a241b9fc
--- /dev/null
+++ b/tools/perf/tests/thread-lookup-time.c
@@ -0,0 +1,179 @@
+#include "tests.h"
+#include "machine.h"
+#include "thread.h"
+#include "map.h"
+#include "debug.h"
+
+static int thread__print_cb(struct thread *th, void *arg __maybe_unused)
+{
+   printf("thread: %d, start time: %"PRIu64" %s\n",
+  th->tid, th->start_time,
+  th->dead ? "(dead)" : th->exited ? "(exited)" : "");
+   return 0;
+}
+
+static int lookup_with_timestamp(struct machine *machine)
+{
+   struct thread *t1, *t2, *t3;
+   union perf_event fork_event = {
+   .fork = {
+   .pid = 0,
+   .tid = 0,
+   .ppid = 1,
+   .ptid = 1,
+   },
+   };
+   struct perf_sample sample = {
+   .time = 5,
+   };
+
+   /* this is needed to keep dead threads in rbtree */
+   perf_has_index = true;
+
+   /* start_time is set to 0 */
+   t1 = machine__findnew_thread(machine, 0, 0);
+
+   if (verbose > 1) {
+   printf("= after t1 created ==\n");
+   machine__for_each_thread(machine, thread__print_cb, NULL);
+   }
+
+   TEST_ASSERT_VAL("wrong start time of old thread", t1->start_time == 0);
+
+   TEST_ASSERT_VAL("cannot find current thread",
+   machine__find_thread(machine, 0, 0) == t1);
+
+   TEST_ASSERT_VAL("cannot find current thread with time",
+   machine__findnew_thread_by_time(machine, 0, 0, 1) == t1);
+
+   /* start_time is overwritten to new value */
+   thread__set_comm(t1, "/usr/bin/perf", 2);
+
+   if (verbose > 1) {
+   printf("= after t1 set comm ==\n");
+   machine__for_each_thread(machine, thread__print_cb, NULL);
+   }
+
+   TEST_ASSERT_VAL("failed to update start time", t1->start_time == 2);
+
+   TEST_ASSERT_VAL("should not find passed thread",
+   /* this will create yet another dead thread */
+   machine__findnew_thread_by_time(machine, 0, 0, 1) != t1);
+
+   TEST_ASSERT_VAL("cannot find overwritten thread with time",
+   machine__find_thread_by_time(machine, 0, 0, 2) == t1);
+
+   /* now t1 goes to dead thread tree, and create t2 */
+   machine__process_fork_event(machine, &fork_event, &sample);
+
+   if (verbose > 1) {
+   printf("= after t2 forked ==\n");
+   machine__for_each_thread(machine, thread__print_cb, NULL);
+   }
+
+   t2 = machine__find_thread(machine, 0, 0);
+
+   TEST_ASSERT_VAL("cannot find current thread", t2 != NULL);
+
+   TEST_ASSERT_VAL("wrong start time of new thread", t2->start_time == 5);
+
+   TEST_ASSERT_VAL("dead thread cannot be found",
+   machine__find_thread_by_time(machine, 0, 0, 1) != t1);
+
+   TEST_ASSERT_VAL("cannot find dead thread

[PATCH 26/40] perf tools: Protect dso cache tree using dso->lock

2015-05-17 Thread Namhyung Kim

The dso cache is accessed during dwarf callchain unwind and it might
be processed concurrently when multi-thread report is enabled.
Protect it under dso->lock.

Note that it doesn't protect dso_cache__find().  I think it's safe to
access the cache tree without the lock since we don't delete nodes.
If it misses an existing node due to rotation, it'll find it during
dso_cache__insert() anyway.
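
The insert side of this scheme can be sketched in isolation as
follows.  This is only an illustration under stated assumptions, not
the patch code: struct cache, struct cache_entry and cache_insert()
are hypothetical names, and a sorted singly linked list stands in for
the rb-tree.  The point is that the insert re-checks for an existing
entry under the lock and hands it back, so a caller that lost the race
can free its own copy and use the winner's:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry covering [offset, offset + size). */
struct cache_entry {
	uint64_t offset;
	uint64_t size;
	struct cache_entry *next;
};

/* Hypothetical cache: a list sorted by offset, guarded by a mutex. */
struct cache {
	pthread_mutex_t lock;
	struct cache_entry *head;
};

/*
 * Insert 'entry' unless an existing entry already covers entry->offset.
 * Returns NULL when 'entry' was inserted, or the existing entry when
 * another thread got there first (the caller then drops 'entry').
 */
static struct cache_entry *
cache_insert(struct cache *c, struct cache_entry *entry)
{
	struct cache_entry **p, *old = NULL;

	assert(c != NULL && entry != NULL);

	pthread_mutex_lock(&c->lock);
	for (p = &c->head; *p != NULL; p = &(*p)->next) {
		if (entry->offset < (*p)->offset)
			break;		/* insert before this entry */
		if (entry->offset < (*p)->offset + (*p)->size) {
			old = *p;	/* range is already cached */
			goto out;
		}
	}
	entry->next = *p;
	*p = entry;
out:
	pthread_mutex_unlock(&c->lock);
	return old;
}
```

A lockless lookup that misses a concurrently inserted entry is then
harmless, because the locked insert path detects the duplicate.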

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/dso.c | 34 +++---
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index c14d981568fd..425989e85302 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -495,10 +495,12 @@ bool dso__data_status_seen(struct dso *dso, enum dso_data_status_seen by)
 }
 
 static void
-dso_cache__free(struct rb_root *root)
+dso_cache__free(struct dso *dso)
 {
+   struct rb_root *root = &dso->data.cache;
struct rb_node *next = rb_first(root);
 
+   pthread_mutex_lock(&dso->lock);
while (next) {
struct dso_cache *cache;
 
@@ -507,10 +509,12 @@ dso_cache__free(struct rb_root *root)
rb_erase(&cache->rb_node, root);
free(cache);
}
+   pthread_mutex_unlock(&dso->lock);
 }
 
-static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
+static struct dso_cache *dso_cache__find(struct dso *dso, u64 offset)
 {
+   const struct rb_root *root = &dso->data.cache;
struct rb_node * const *p = &root->rb_node;
const struct rb_node *parent = NULL;
struct dso_cache *cache;
@@ -529,17 +533,20 @@ static struct dso_cache *dso_cache__find(const struct rb_root *root, u64 offset)
else
return cache;
}
+
return NULL;
 }
 
-static void
-dso_cache__insert(struct rb_root *root, struct dso_cache *new)
+static struct dso_cache *
+dso_cache__insert(struct dso *dso, struct dso_cache *new)
 {
+   struct rb_root *root = &dso->data.cache;
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct dso_cache *cache;
u64 offset = new->offset;
 
+   pthread_mutex_lock(&dso->lock);
while (*p != NULL) {
u64 end;
 
@@ -551,10 +558,17 @@ dso_cache__insert(struct rb_root *root, struct dso_cache *new)
p = &(*p)->rb_left;
else if (offset >= end)
p = &(*p)->rb_right;
+   else
+   goto out;
}
 
rb_link_node(&new->rb_node, parent, p);
rb_insert_color(&new->rb_node, root);
+
+   cache = NULL;
+out:
+   pthread_mutex_unlock(&dso->lock);
+   return cache;
 }
 
 static ssize_t
@@ -572,6 +586,7 @@ static ssize_t
 dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 {
struct dso_cache *cache;
+   struct dso_cache *old;
ssize_t ret;
 
do {
@@ -591,7 +606,12 @@ dso_cache__read(struct dso *dso, u64 offset, u8 *data, ssize_t size)
 
cache->offset = cache_offset;
cache->size   = ret;
-   dso_cache__insert(&dso->data.cache, cache);
+   old = dso_cache__insert(dso, cache);
+   if (old) {
+   /* we lost the race */
+   free(cache);
+   cache = old;
+   }
 
ret = dso_cache__memcpy(cache, offset, data, size);
 
@@ -608,7 +628,7 @@ static ssize_t dso_cache_read(struct dso *dso, u64 offset,
 {
struct dso_cache *cache;
 
-   cache = dso_cache__find(&dso->data.cache, offset);
+   cache = dso_cache__find(dso, offset);
if (cache)
return dso_cache__memcpy(cache, offset, data, size);
else
@@ -964,7 +984,7 @@ void dso__delete(struct dso *dso)
 
dso__data_close(dso);
auxtrace_cache__free(dso->auxtrace_cache);
-   dso_cache__free(&dso->data.cache);
+   dso_cache__free(dso);
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
pthread_mutex_destroy(&dso->lock);
-- 
2.4.0

