[lttng-dev] Compile out of src tree

2013-01-15 Thread paul.chavent
Hi.

Just a little patch to make building out of the source tree succeed.

Regards.

Paul.

diff --git a/tests/tools/health/Makefile.am b/tests/tools/health/Makefile.am
index 26d2461..1424f63 100644
--- a/tests/tools/health/Makefile.am
+++ b/tests/tools/health/Makefile.am
@@ -1,4 +1,4 @@
-AM_CFLAGS = -I. -O2 -g -I../../../include
+AM_CFLAGS = -I$(srcdir) -O2 -g -I$(top_srcdir)/include
 AM_LDFLAGS =
 
 if LTTNG_TOOLS_BUILD_WITH_LIBDL






Re: [lttng-dev] Compile out of src tree

2013-01-15 Thread Mathieu Desnoyers
* paul.chav...@fnac.net (paul.chav...@fnac.net) wrote:
 Hi.
 
 Just a little patch to make building out of the source tree succeed.

Good point!

Acked-by: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>

 
 Regards.
 
 Paul.
 
 diff --git a/tests/tools/health/Makefile.am b/tests/tools/health/Makefile.am
 index 26d2461..1424f63 100644
 --- a/tests/tools/health/Makefile.am
 +++ b/tests/tools/health/Makefile.am
 @@ -1,4 +1,4 @@
 -AM_CFLAGS = -I. -O2 -g -I../../../include
 +AM_CFLAGS = -I$(srcdir) -O2 -g -I$(top_srcdir)/include
  AM_LDFLAGS =
  
  if LTTNG_TOOLS_BUILD_WITH_LIBDL
 
 
 
 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com



Re: [lttng-dev] [lttng-modules PATCH] Add uprobes support

2013-01-15 Thread Mathieu Desnoyers
* Yannick Brosseau (yannick.bross...@gmail.com) wrote:
 The added support is basic. It creates an event with no data associated
 with the specified file path + offset.
 Given the structures currently used, we cannot pass a file path longer
 than 256 characters.

Using a pointer to a user-space string will allow you to overcome this
limitation. e.g. passing through the kernel ABI:

char name[LTTNG_KERNEL_SYM_NAME_LEN];
char __user *path;
uint64_t offset;
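
For illustration, a minimal sketch of how the kernel side could pull such a
user-space path in at event-creation time (the helper name is made up;
strndup_user() and PATH_MAX are the standard kernel facilities):

#include <linux/err.h>
#include <linux/limits.h>
#include <linux/string.h>
#include <linux/uaccess.h>

/*
 * Sketch, not part of the posted patch: duplicate the user-supplied
 * path into kernel memory.  strndup_user() allocates a kernel copy of
 * up to PATH_MAX bytes and returns an ERR_PTR on failure; the caller
 * frees it with kfree() once the uprobe has been registered.
 */
static char *lttng_uprobes_dup_path(const char __user *upath)
{
	return strndup_user(upath, PATH_MAX);
}

This keeps the fixed-size path array out of the ABI structure entirely, so
the 256-character limit disappears.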


 
 Signed-off-by: Yannick Brosseau <yannick.bross...@gmail.com>
 ---
  README |5 +-
  lttng-abi.c|3 +
  lttng-abi.h|7 +++
  lttng-events.c |   18 ++
  lttng-events.h |   38 
  probes/Makefile|5 ++
  probes/lttng-uprobes.c |  156 
 
  7 files changed, 231 insertions(+), 1 deletion(-)
  create mode 100644 probes/lttng-uprobes.c
 
 diff --git a/README b/README
 index 1bcd5b2..7afdaa3 100644
 --- a/README
 +++ b/README
 @@ -9,7 +9,7 @@ need for additional patches. Other features:
  - Produces CTF (Common Trace Format) natively,
(http://www.efficios.com/ctf)
  - Tracepoints, Function tracer, CPU Performance Monitoring Unit (PMU)
 -  counters, kprobes, and kretprobes support,
 +  counters, kprobes, kretprobes and uprobes support,
  - Integrated interface for both kernel and userspace tracing,
  - Have the ability to attach context information to events in the
trace (e.g. any PMU counter, pid, ppid, tid, comm name, etc).
 @@ -86,6 +86,9 @@ CONFIG_KPROBES:
  CONFIG_KRETPROBES:
  Dynamic function entry/return probe.
 lttng enable-event -k --function ...
 +CONFIG_UPROBES:
 +Dynamic userspace probe.
 +   lttng enable-event -k --uprobe ...

It would be great to implement this feature with integrated lookup of
file/line number (using dwarf). I think perf already has some code that
does it. Adding Masami in the loop would be very relevant. In terms of
usability it will make all the difference in the world. Of course, this
won't change the ABI nor the lttng-modules implementation, only the
lttng-tools side.
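
DWARF line info aside, here is a rough sketch (illustrative name, ELF64 only,
minimal error handling) of the kind of lookup lttng-tools could perform to
turn a function name into the file offset expected by the path + offset ABI
above:

#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Sketch: return the file offset of symbol "sym" in the ELF64 binary
 * at "path", i.e. the value one would pass as the uprobe offset.
 * Returns -1 on error or if the symbol is not found.
 */
static long symbol_file_offset(const char *path, const char *sym)
{
	long ret = -1;
	struct stat st;
	char *map = MAP_FAILED;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0)
		goto out;
	map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	Elf64_Ehdr *eh = (Elf64_Ehdr *) map;
	Elf64_Shdr *sh = (Elf64_Shdr *) (map + eh->e_shoff);

	for (int i = 0; i < eh->e_shnum && ret < 0; i++) {
		if (sh[i].sh_type != SHT_SYMTAB && sh[i].sh_type != SHT_DYNSYM)
			continue;
		Elf64_Sym *syms = (Elf64_Sym *) (map + sh[i].sh_offset);
		const char *str = map + sh[sh[i].sh_link].sh_offset;
		size_t nr = sh[i].sh_size / sizeof(Elf64_Sym);

		for (size_t j = 0; j < nr; j++) {
			if (syms[j].st_shndx == SHN_UNDEF
					|| syms[j].st_shndx >= SHN_LORESERVE)
				continue;
			if (strcmp(str + syms[j].st_name, sym) != 0)
				continue;
			/* Translate the symbol's virtual address into a file offset. */
			Elf64_Shdr *text = &sh[syms[j].st_shndx];
			ret = syms[j].st_value - text->sh_addr + text->sh_offset;
			break;
		}
	}
	munmap(map, st.st_size);
out:
	close(fd);
	return ret;
}

The lttng enable-event front end could then hand the binary path and this
offset down through the lttng_kernel_uprobe structure added by the patch.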

  
  
  * Note about Perf PMU counters support
 diff --git a/lttng-abi.c b/lttng-abi.c
 index 25a350a..fe428b2 100644
 --- a/lttng-abi.c
 +++ b/lttng-abi.c
 @@ -610,6 +610,9 @@ int lttng_abi_create_event(struct file *channel_file,
   case LTTNG_KERNEL_FUNCTION:
   event_param->u.ftrace.symbol_name[LTTNG_KERNEL_SYM_NAME_LEN - 1] = '\0';
   break;
  + case LTTNG_KERNEL_UPROBE:
  + event_param->u.uprobe.path[LTTNG_KERNEL_SYM_NAME_LEN - 1] = '\0';
  + break;
   default:
   break;
   }
 diff --git a/lttng-abi.h b/lttng-abi.h
 index 8d3ecdd..5abdb39 100644
 --- a/lttng-abi.h
 +++ b/lttng-abi.h
 @@ -34,6 +34,7 @@ enum lttng_kernel_instrumentation {
   LTTNG_KERNEL_KRETPROBE  = 3,
   LTTNG_KERNEL_NOOP   = 4,/* not hooked */
   LTTNG_KERNEL_SYSCALL= 5,
 + LTTNG_KERNEL_UPROBE = 6,
  };
  
  /*
 @@ -79,6 +80,11 @@ struct lttng_kernel_function_tracer {
   char symbol_name[LTTNG_KERNEL_SYM_NAME_LEN];
  }__attribute__((packed));
  
 +struct lttng_kernel_uprobe {
 + char path[LTTNG_KERNEL_SYM_NAME_LEN];
 + uint64_t offset;
 +}__attribute__((packed));
 +
  /*
   * For syscall tracing, name = '\0' means enable all.
   */
 @@ -94,6 +100,7 @@ struct lttng_kernel_event {
   struct lttng_kernel_kretprobe kretprobe;
   struct lttng_kernel_kprobe kprobe;
   struct lttng_kernel_function_tracer ftrace;
 + struct lttng_kernel_uprobe uprobe;
   char padding[LTTNG_KERNEL_EVENT_PADDING2];
   } u;
  }__attribute__((packed));
 diff --git a/lttng-events.c b/lttng-events.c
 index 4f30904..38c901f 100644
 --- a/lttng-events.c
 +++ b/lttng-events.c
 @@ -392,6 +392,16 @@ struct lttng_event *lttng_event_create(struct lttng_channel *chan,
   if (!event->desc)
   goto register_error;
   break;
  + case LTTNG_KERNEL_UPROBE:
  + ret = lttng_uprobes_register(event_param->name,
  + event_param->u.uprobe.path,
  + event_param->u.uprobe.offset,
  + event);
  + if (ret)
  + goto register_error;
  + ret = try_module_get(event->desc->owner);
  + WARN_ON_ONCE(!ret);
  + break;
   default:
   WARN_ON_ONCE(1);
   }
 @@ -443,6 +453,10 @@ int _lttng_event_unregister(struct lttng_event *event)
   case LTTNG_KERNEL_NOOP:
   ret = 0;
   break;
 + case LTTNG_KERNEL_UPROBE:
 + lttng_uprobes_unregister(event);
 + ret = 0;
 + break;
   default:
   WARN_ON_ONCE(1);
   }
 @@ -473,6 +487,10 @@ void _lttng_event_destroy(struct lttng_event *event)
   break;

Re: [lttng-dev] [rp] [RFC PATCH urcu] Add last output parameter to pop/dequeue

2013-01-15 Thread Paul E. McKenney
[Sorry for the delay, finally getting back to this.]

On Mon, Dec 17, 2012 at 09:40:09AM -0500, Mathieu Desnoyers wrote:
 * Paul E. McKenney (paul...@linux.vnet.ibm.com) wrote:
  On Thu, Dec 13, 2012 at 06:44:56AM -0500, Mathieu Desnoyers wrote:
   I noticed that in addition to having:
   
   - push/enqueue returning whether the stack/queue was empty prior to the
 operation,
   - pop_all/splice, by nature, emptying the stack/queue,
   
   it can be interesting to make pop/dequeue operations return whether they
   are returning the last element of the stack/queue (therefore emptying
   it). This allows extending the test cases that cover the number of
   empty stack/queue states encountered by both push/enqueuer and
   pop/dequeuer threads, not only to push/enqueue paired with
   pop_all/splice, but also to pop/dequeue.
   
   In the case of wfstack, this unfortunately requires modifying an
   already-exposed API. As an RFC, one question we should answer is how we
   want to handle the way forward: should we add new functions to the
   wfstack API and leave the existing ones alone?
   
   Thoughts?
  
  Hmmm...  What is the use case, given that a push might happen immediately
  after the pop said that the stack/queue was empty?  Of course, if we
  somehow know that there are no concurrent pushes, we could instead
  check for empty.
  
  So what am I missing here?
 
 The setup for those use-cases is the following (I'm using the stack as
 an example, but the same applies to the queue):
 
 - we have N threads doing push and using the push return value that
   states whether it pushed into an empty stack.
 - we have M threads doing pop, using the return value to know whether a
   pop leaves the stack in the empty state. Following the locking
   requirements, we protect those M threads' pops with a mutex, but they
   don't need to be protected against push.
 
 To help show where the idea comes from, let's start with a similar
 use-case (push/pop_all). Knowing whether we pushed into an empty stack,
 together with pop_all, becomes very useful when you want to combine the
 stack with a higher-level batching semantic tied to the elements present
 within the stack.

 In the case of grace-period batching, for instance, I used push/pop_all
 to provide this kind of semantic: if we push into an empty stack, we know
 we will have to go through the grace period. If we push into a non-empty
 stack, we just wait to be awakened by the first thread, the one that
 pushed into the empty stack. This requires that we use pop_all before
 going through the grace period.
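
A self-contained sketch of that batching pattern (the LIFO is hand-rolled
with C11 atomics rather than the actual wfstack API so the example stands
alone; waiters spin for brevity where a real implementation would sleep):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <urcu.h>		/* synchronize_rcu() */

struct waiter {
	struct waiter *next;
	_Atomic bool done;
};

static _Atomic(struct waiter *) head;	/* minimal illustrative LIFO */

/* Returns true when we pushed onto an empty stack, i.e. we own the batch. */
static bool push_waiter(struct waiter *w)
{
	struct waiter *old = atomic_load(&head);

	do {
		w->next = old;
	} while (!atomic_compare_exchange_weak(&head, &old, w));
	return old == NULL;
}

void wait_for_batched_grace_period(struct waiter *me)
{
	atomic_store(&me->done, false);
	if (push_waiter(me)) {
		/* Detach the whole batch *before* the grace period... */
		struct waiter *batch = atomic_exchange(&head, NULL);

		synchronize_rcu();
		/* ...then release every waiter that piggybacked on it. */
		for (struct waiter *w = batch, *next; w; w = next) {
			next = w->next;	/* read before waking: w may be reused */
			atomic_store(&w->done, true);
		}
	} else {
		while (!atomic_load(&me->done))
			;	/* spin for brevity; a futex or condvar fits here */
	}
}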
 
 Now more specifically about pop, one use-case I have in mind is
 energy-efficient handling of empty stacks. With M threads executing
 pop, let's suppose we want them to be blocked on a futex when there is
 nothing to do. Now the tricky part is: how can we do this without adding
 overhead (extra loads/stores) to the stack?
 
 If we have the ability to know whether we are popping the last element
 of a stack, we can use this information to go into a futex wait state
 after having handled the last element. Since the threads doing push
 would monitor whether they push into an empty stack, they would wake us
 whenever needed.
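
A hedged sketch of that futex scheme (the stack primitives are hypothetical
stand-ins with exactly the semantics described above; only the futex
discipline is spelled out):

#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

struct node;
/* Hypothetical stack primitives with the semantics discussed above. */
extern int stack_push(struct node *n);      /* nonzero: pushed onto an empty stack */
extern struct node *stack_pop(int *last);   /* *last set when this pop emptied the stack */
extern void process(struct node *n);

static int32_t work_available;	/* 0 = believed empty, 1 = known non-empty */

static void futex_wait_on(int32_t *addr, int32_t val)
{
	syscall(SYS_futex, addr, FUTEX_WAIT, val, NULL, NULL, 0);
}

static void futex_wake_waiters(int32_t *addr)
{
	syscall(SYS_futex, addr, FUTEX_WAKE, INT32_MAX, NULL, NULL, 0);
}

/* Pop side (the M pop threads, serialized by the usual pop mutex). */
void consume_one(void)
{
	int last = 0;
	struct node *n = stack_pop(&last);

	if (!n) {
		futex_wait_on(&work_available, 0);	/* nothing to do at all */
		return;
	}
	process(n);
	if (last) {
		/*
		 * We just emptied the stack: publish that, then sleep.  A real
		 * implementation would retry one pop after the store to close
		 * the race with a push that slipped in between.
		 */
		__atomic_store_n(&work_available, 0, __ATOMIC_SEQ_CST);
		futex_wait_on(&work_available, 0);
	}
}

/* Push side (the N push threads). */
void produce_one(struct node *n)
{
	if (stack_push(n)) {	/* pushed onto an empty stack: wake the poppers */
		__atomic_store_n(&work_available, 1, __ATOMIC_SEQ_CST);
		futex_wake_waiters(&work_available);
	}
}

Neither side touches the stack's cache lines beyond what push/pop already do;
the only extra traffic is on the separate futex word, and only on the
empty/non-empty transitions.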
 
 If instead we choose to simply wait until one of the M threads discovers
 that the stack is actually empty, we issue an extra pop (which fails)
 each time the stack is empty. In the worst case, if a queue always flips
 between 0 and 1 elements, we double the number of pops needed to handle
 the same number of nodes.
 
 Otherwise, if we choose to add an explicit check to see whether the
 stack is empty, we are adding an extra load of the head node for every
 pop.
 
 Another use-case I see is low-overhead monitoring of stack usage
 efficiency. For this kind of use-case, we might want to know, within
 both push and pop threads, whether we are underutilizing our system
 resources. Being able to detect that we are reaching the empty state,
 without adding any overhead to stack memory traffic, gives us that
 information.
 
 I must admit that the use-cases for returning whether pop takes the last
 element are not as strong as the batching case with push/pop_all, mainly
 because, AFAIU, we can achieve the same result with an extra check of the
 stack's emptiness (either an explicit empty() check, or an extra pop that
 will see an empty stack). What we are saving here is the extra overhead
 on the stack's cache lines caused by that extra check.
 
 Another use-case, although maybe less compelling, is for validation.
 With concurrent threads doing push/pop/pop_all operations on the stack,
 we can perform the following check: If we empty the stack at the end of
 test execution, the
 
   number of push-to-empty-stack
 
   must be equal to the
 
   number of pop_all-from-non-empty-stack
   + number of pop-last-element-from-non-empty-stack
 
 We should note that this validation could not be performed if pop did
 not return whether it popped the last element.
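
For example, a test harness could keep three counters and check the identity
once all threads have joined (names are illustrative):

#include <assert.h>
#include <stdatomic.h>

static atomic_long push_to_empty;	/* push() reported "stack was empty" */
static atomic_long pop_all_nonempty;	/* pop_all() returned a non-empty chain */
static atomic_long pop_last;		/* pop() reported "that was the last element" */

/*
 * Increment sites (sketch):
 *   push path:    if (stack_push(...))   atomic_fetch_add(&push_to_empty, 1);
 *   pop path:     if (last)              atomic_fetch_add(&pop_last, 1);
 *   pop_all path: if (head != NULL)      atomic_fetch_add(&pop_all_nonempty, 1);
 */

/* Called once all worker threads have joined and the stack is empty. */
static void check_empty_transitions(void)
{
	assert(atomic_load(&push_to_empty) ==
	       atomic_load(&pop_all_nonempty) + atomic_load(&pop_last));
}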

Re: [lttng-dev] [PATCH] Add ACCESS_ONCE() to avoid compiler splitting assignments

2013-01-15 Thread Mathieu Desnoyers
* Paul E. McKenney (paul...@linux.vnet.ibm.com) wrote:
 As noted by Konstantin Khlebnikov, gcc can split assignment of
 constants to long variables (https://lkml.org/lkml/2013/1/15/141),
 though assignment of NULL (0) is OK.  Assuming that a gcc bug is
 fixed (http://gcc.gnu.org/bugzilla/attachment.cgi?id=29169&action=diff
 has a patch), making the store volatile keeps gcc from splitting it.
 
 This commit therefore applies ACCESS_ONCE() to CMM_STORE_SHARED(),
 which is the underlying primitive used by rcu_assign_pointer().

Hi Paul,

I recognise that this is an issue in the Linux kernel, since a simple
store is used and expected to be performed atomically when aligned.
However, I think this does not affect liburcu, see below:

 
 Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
 
 diff --git a/urcu/system.h b/urcu/system.h
 index 2a45f22..7a1887e 100644
 --- a/urcu/system.h
 +++ b/urcu/system.h
 @@ -49,7 +49,7 @@
   */
  #define CMM_STORE_SHARED(x, v)   \
   ({  \
 - __typeof__(x) _v = _CMM_STORE_SHARED(x, v); \
 + __typeof__(x) CMM_ACCESS_ONCE(_v) = _CMM_STORE_SHARED(x, v); \

Here, the macro _CMM_STORE_SHARED(x, v) is doing the actual store.
It stores v into x. So adding a CMM_ACCESS_ONCE(_v), as you propose
here, is really only making sure the return value (usually unused),
located on the stack, is accessed with a volatile access, which does not
make much sense.

What really matters is the _CMM_STORE_SHARED() macro:

#define _CMM_STORE_SHARED(x, v) ({ CMM_ACCESS_ONCE(x) = (v); })

which already uses a volatile access for the store. So this seems to be
a case where our preemptive use of volatile for stores, in addition to
loads, kept us bug-free against a gcc behavior that was unexpected at the
time we implemented this macro. Just a touch of paranoia seems to be a
good thing sometimes. ;-)
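
To spell it out (CMM_ACCESS_ONCE is paraphrased from memory here, see
urcu/compiler.h for the authoritative definition; the example assumes a
64-bit long):

/* Paraphrased from the liburcu headers discussed above. */
#define CMM_ACCESS_ONCE(x)	(*(__volatile__ __typeof__(x) *)&(x))
#define _CMM_STORE_SHARED(x, v)	({ CMM_ACCESS_ONCE(x) = (v); })

long shared;

void example(void)
{
	/* Per the report above, gcc may split this into two 32-bit stores. */
	shared = 0x1234567812345678L;

	/* The volatile lvalue forces a single store; no splitting. */
	_CMM_STORE_SHARED(shared, 0x1234567812345678L);
}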

Thoughts?

Thanks,

Mathieu

   cmm_smp_wmc();  \
   _v; \
   })
 
 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
